Preface


This book grew out of lecture notes for my master’s courses on general relativity (GR) and 
on singularities and black holes taught at Radboud University (Nijmegen). These notes were 
originally intended for our students with a double bachelor degree in mathematics and physics, 
but in its final form the book is intended for all students of GR of any age and orientation who have 
a background including at least first courses in special and general relativity, differential geometry, 
and topology.1 The recent textbook Elements of General Relativity by Chrusciel (2019) would
make students singularly well prepared for this one, but almost any introduction to GR, combined 
with the typical mathematical background in manifolds etc. that is usually included in such 
introductions, will do. This book, then, is a second, mathematically oriented course in general 
relativity, with extensive references and occasional excursions in the history and philosophy of 
gravity, including a relatively lengthy historical introduction. As such, it omits standard physics 
material like the classical tests etc. Furthermore, the material is developed in such a way that 
through the last two chapters the reader may acquire a taste of the modern mathematical study of 
black holes initiated by Penrose, Hawking, and others, so that successful readers might be able 
to begin reading research papers in this direction, especially in mathematical physics and in the 
philosophy of physics. This focus comes with an introduction to what is called causal theory, 
but alas, it also implies that in order to keep the book medium-sized I had to omit applications 
like cosmology and gravitational waves. In any case I hope the book appeals to mathematicians, 
physicists, and philosophers—perhaps even historians—of physics alike. 

My own experience is that a really deep field such as GR (or quantum theory) can only be 
learned by reading a large number of books saying the right things in different ways, as well 
as by talking to good people working in the field. As a reader, my first encounter with GR was 
Einstein’s own exposition Relativity: The Special and General Theory (Einstein, 1921), which is 
still in print. In the summer of 1981, when I had just graduated from high school, this was followed
by two books that were a little more difficult, namely Space - Time - Matter by Weyl (1922) and 
The Mathematical Theory of Relativity by Eddington (1923), both of which are not only highly 
mathematical but also profoundly philosophical in spirit. Weyl makes this point himself: 


At the same time it was my wish to present this great subject as an illustration of the 
intermingling of philosophical, mathematical, and physical thought, a study which is dear 
to my heart. This could only be done by building up the theory systematically from the 
foundations and by restricting attention throughout to the principles. But I have not been 
able to satisfy these self-imposed requirements: the mathematician predominates at the 
expense of the philosopher.2 (Weyl, 1918, Preface)


Indeed, Weyl, Eddington and Einstein were natural philosophers in the spirit of the scientific 
revolution, whose mix of physics, mathematics, and philosophy was the key to its success. Hence 
it seems hardly a coincidence that Einstein was Newton’s successor, for if any scientific theory 
has ever represented the Philosophiae Naturalis Principia Mathematica, it must be GR. 


1Logically speaking, the GR material is even developed from scratch, and indeed the first course in this direction
that I taught was optimistically offered also to mathematics students without any physics background. But experience 
shows that the material makes little sense without some prior exposure to both special and general relativity. 

2‘Zugleich wollte ich an diesem großen Thema ein Beispiel geben für die gegenseitige Durchdringung
philosophischen, mathematischen und physikalischen Denkens, die mir sehr am Herzen liegt; dies konnte nur durch
einen völlig in sich geschlossenen Aufbau von Grund auf gelingen, der sich durchaus auf das Prinzipielle beschränkt.
Aber ich habe meinen eigenen Forderungen in dieser Hinsicht nicht voll Genüge tun können: der Mathematiker
behielt auf Kosten des Philosophen das Übergewicht.’ Translation: Henry L. Brose (Weyl, 1922).

So even though at the time I understood almost nothing of the technical content of their 
books, Einstein, Weyl, and Eddington left an indelible mark in the way they approached natural 
science through mathematics and philosophy. Still during that same long summer vacation 
between high school and university in 1981, which I regard as one of the high points of my life, I
also bought Gravitation by Misner, Thorne & Wheeler (1973). For a while I considered this the
greatest book written on any topic whatsoever,3 and when Misner, well in his eighties at the time,
not only came to a talk I gave at one of Bub’s New Directions in the Foundations of Physics
conferences in Washington DC but even asked a question, I was petrified and unable to say
anything once he had confirmed, in reply to my counter-question, that he was indeed Charles Misner.4

My next book was The Large Scale Structure of Space-Time by Hawking & Ellis (1973), and 
so on, until General Relativity and the Einstein Equations by Choquet-Bruhat (2009) and most 
recently The Geometry of Black Holes by Chrusciel (2020). These are all masterpieces written 
by founders of the field; like most students and authors in mathematical GR I am also indebted to 
Penrose (1972), O’Neill (1983) and Wald (1984). Furthermore, Earman (1995) set the stage in
the philosophy of physics. Other influences on this text include Weinberg (1972), Kriele (1999), 
Poisson (2004), Schoen (2009), Gourgoulhon (2012), Malament (2012), and Minguzzi (2019). 

This brings me to the question why an author who so far wrote little on GR is entitled to write 
a book about the subject, even if it has been an almost lifelong passion. In the first of the Jeeves
and Wooster episodes (about an indolent English aristocrat and his butler), Wooster’s aunt asks: 


Do you work, Mr Wooster? 
upon which Wooster (i.e. the aristocrat), taken aback by her question, mumbles: 
Well, I’ve known a few people who work. 


I’ve known a few people who work, too (in GR, that is). The greatest of these, in my view, is 
Roger Penrose, to whom this book is dedicated in honour of his pivotal role in the creation of 
mathematical relativity and the modern theory of singularities and black holes,5 combined with a
singular lack of pomp and circumstance, for a scientist of his calibre. In her recent autobiography, 
Yvonne Choquet-Bruhat, who has known Penrose for over 50 years, puts it well: 


In spite of his successes, he remains a man without pretension, open and friendly. He came 
to listen, a few years ago, to a talk I gave at a seminar in Oxford. Afterwards we had lunch 
with a few colleagues and the conversation turned to the publication of his complete works. 
Penrose said: ‘My problem is to know if I must correct my mistakes before publication.’ It 
is a great quality to recognize a mistake, even small. Few human beings, scientists or not, 
are ready to do it. (Choquet-Bruhat, 2018, chapter 10) 


Perhaps the key to his success, which on the one hand seems typical for most great scientists and 
artists but on the other hand seems paradoxical as a path to influence and eminence, is this: 


3Kaiser (2012) gives an interesting perspective on Gravitation and its history, which confirms its uniqueness. 

4Nonetheless, I now see a basic drawback of Gravitation: with its xxvi + 1279 pages, it leaves no room for the
reader (except in doing the exercises, all of which I duly did in the next few years), who is overwhelmed and cornered.

5A scientific biography of Penrose remains to be written (in 2019 Dennis Lehmkuhl conducted a series of
interviews with Penrose). For now, see e.g. Thorne (1994), Frauendiener (2000), Friedrich (2011), and Ellis (2014). 
Both the written AIP interview by Lightman (1989) and the videotaped interview by Turing’s biographer and 
Penrose’s former student Hodges (2014) are great and intimate portraits of Penrose. 
It was important for me always, if I wanted to work on a problem, to think I had a different
angle on it from other people. Because I wasn’t good at following where everybody else 
went. I wasn’t the kind of person who could pick up the prevalent arguments and knowledge 
of the time. Other people were good at that. They could suck it all out and put it together and 
make advances. I was the kind of person who’d have some kind of quirky way of looking at 
something on my own, which I would hide away and work at. So it meant that I had to have 
some way of looking at a problem that was my own.6


Here one would like to emphasize that the word ‘looking’, used twice, should be taken quite 
literally: as he also emphasized elsewhere, Penrose is primarily a visual thinker. This is 
exemplified most famously by his invention of the diagrams named after him, but it goes back 
a long way, including for example the “impossible figures” he created with his father, and his 
interaction with the Dutch artist Maurits Cornelis Escher (1898-1972).7 Penrose usually drew
his own figures in a professional, yet playful and characteristic way, and each of them not only 
makes some scientific point but is also a pleasure to look at. A few are reproduced in this book. 

I first heard Penrose speak in Cambridge in 1989 about his recent book The Emperor’s New 
Mind: Concerning Computers, Minds and The Laws of Physics, later superseded by the equally 
controversial Shadows of the Mind: A Search for the Missing Science of Consciousness (1994). I 
got to know him personally during a Seven Pines Symposium in Minnesota in 2005, where the 
organizers had the luminous idea that famous and ordinary participants share an apartment. I am 
not sure to which one of us this arrangement was initially more shocking, but we got along well, 
and he very kindly came to the opening conference of our institute IMAPP at Nijmegen (2005) 
as a speaker (forming part of a stellar line-up including for example physicist Gerard ’t Hooft, 
mathematician Don Zagier, and theologian Hans Küng), where he explained the key ideas of 
his later book Cycles of Time (2010). Having him all for myself for 1.5 hours, I then drove him 
to the famous Amstel Hotel in Amsterdam, the most expensive hotel in the country, since I felt 
that if that is the place where Bob Dylan and the like stay, certainly also Roger belonged there.9
He later returned to the Netherlands for a mathematical physics conference and usually came to 
my talks when I was visiting Oxford and joined for lunch or dinner whenever possible. 

Dominating the public image, Stephen Hawking was unquestionably another key figure in 
mathematical relativity.10 I observed Hawking on an almost daily basis between 1989 and 1997,
when I was a postdoc at DAMTP in Cambridge, but I wasn’t in his group and never talked to him 
directly. I did mingle with his circle though, and inhaled a certain culture from this. Although in 
the wake of his Brief History of Time (1988) Stephen had by then become a scientific superstar, 
it is only after his death in 2018 that I really came to appreciate his genius and his life.11

Hence this book has been heavily influenced by Hawking and Penrose, and of course includes 
their singularity (i.e. incompleteness) theorems, but without being blind to other developments, 
notably the initial-value or PDE approach to GR, which, as will be explained in detail especially 
in connection with cosmic censorship, sometimes leads to a different perspective on space-time. 


6Quoted from the AIP interview by Lightman (1989).

7See Wright (2014) for the history of Penrose diagrams; Wright (2013) explains the link with Escher. Penrose 
admired Escher at least since 1954, when, as a student participant in the International Congress of Mathematicians
in Amsterdam, he saw an Escher exhibition. In 1962 Penrose visited Escher at his home in Baarn. See Penrose
(2005), chapter 2, and the TV documentary Penrose (2015). See also §1.9 and footnote 486.

8At the conference dinner we gave each speaker an expensive Japanese wooden puzzle, which we asked them to
solve as quickly as possible. Penrose won easily (which, in good spirits, greatly annoyed ’t Hooft and Zagier).

9Our financial staff did not appreciate this and I paid for his room, with a river view doubling the price, myself.

10See §1.9 for some brief historical comments on the development of mathematical GR.

11See Hawking (1999) for an unusually honest account of this life; the second (2007) edition is milder.

A complete coverage of the causal theory is both impossible in a work of this size and 
undesirable for students looking for a first encounter, but fortunately it is also unnecessary in 
view of the recent encyclopedic (Open Access) treatment by Minguzzi (2019), which always 
lies open on my desk. Similarly, a complete description of the PDE approach to GR would 
require not only a very different author (or rather a team of authors), but also much more space 
including preliminary material. So we are fortunate to have Ringström (2009) for those who 
want more than the very first introduction given here, as well as Klainerman & Nicolò (2003) and 
Christodoulou (2008). I have tried to do justice to the modern spirit of mathematical relativity, 
which is characterized by a mix of the causal and the PDE theories and culminates in the cosmic 
censorship and final state conjectures. The aim of this book is not at all to describe the latest 
news about such matters, but merely to explain what the discussions are about, and give students 
and more senior readers not specializing in this area an entry point to the research literature.
Likewise for the no-hair or uniqueness theorems for black holes and black hole thermodynamics, 
with which the book ends. Thus the book stops not only where (mathematical or philosophical) 
research papers on classical GR begin, but also where quantum aspects of gravity begin. 

Finally, as may be expected more from a work in the humanities than in mathematical physics 
(between which the history and philosophy of physics resides), there are almost 700 footnotes, 
placed where the name “footnote” suggests they belong. They contain credits (e.g. for some of 
the arguments and derivations I give) and other pointers to the literature, as well as additional 
information that refines or qualifies the mathematics just discussed, and/or adds conceptual or 
historical information I found interesting. They may be skipped in principle by those who just 
want to hear the melody, but they seem to me to be essential for enjoying the full sound. 

For a more detailed summary of this book the prospective reader is encouraged to take a look 
at the synopsis and the table of contents, which in this order immediately follow this preface. 


I received very kind help and feedback from a number of students and colleagues, of whom 
I would like to mention Ibai Asensio Pol, Jeremy Butterfield, Erik Curiel, Jeroen van Dongen, 
Juliusz Doboszewski, John Earman, Jan Głowacki, Evert-Jan Hekkelman, Leo Garcia Heveling,
Michel Janssen, Dennis Lehmkuhl, Martin Lesourd, Sera Markoff, Ettore Minguzzi, John Norton,
Bryan Roberts, Quinten Rutgers, and Jan Sbierski. Most chapters were also reviewed during the 
2020-2021 Cambridge-LSE Philosophy of Physics Bootcamp, which was of great help. 


The final edit of this book was done during July 2021 at a lovely cottage by the river IJssel, 
which we could use thanks to the generous hospitality of our friends Arend and Esther van der 
Sluis. This last round also benefited from the online conference Singularity theorems, causality, 
and all that: A tribute to Roger Penrose, held in June 2021 (organized by Piotr Chrusciel, Greg 
Galloway, Michael Kunzinger, Ettore Minguzzi, and Roland Steinbauer) where I could pick up 
the latest news and was also given the unexpected honour to speak. My greatest debt, however, 
is to Edith de Jong, who contributed so much more than the beautiful cover art and various 
drawings to this book, including the one of Penrose on the dedication page.12


12This drawing is based on a photograph of Penrose in Gravitation (Misner, Thorne, and Wheeler, 1973, p. 936).
Synopsis


Here is a brief summary of the chapters, which may also help potential readers as well as 
instructors. My own experience is that chapters 2, 3, 4, 5, and 7 may form the basis of a 
one-semester (master’s) course entitled Mathematical structure of general relativity.13 This may
be followed by another one-semester course called Singularities and black holes,14 based on
chapters 6, §§8.1-8.4, 9, and a selection of topics from chapter 10. For advanced students with
sufficient background in both GR and mathematics, the latter could also stand alone.15

Since this book is also, perhaps even largely, intended for self-study and pleasure, it contains 
no exercises. However, instructors (and even students with enough self-discipline) can easily 
assign almost any derivation as an exercise (for themselves). Many difficult results are just 
mentioned without proof (always with a reference), and these could serve as advanced problems. 


1. Historical introduction. Based on recently completed research by historians of science, 
the reader is introduced to Einstein’s “bumpy road” to his theory of general relativity. 
Although GR may well be the most sublime of all scientific theories, created by a man 
who is widely (perhaps exaggeratedly) seen as one of the supreme geniuses humanity has
brought about, at least the story of its discovery is “human, all too human”. I also include 
a little mathematical history, involving Riemann and others, as well as a brief picture of 
mathematical GR until about 1970. I close with some musings on general covariance. 


2. General differential geometry. This is a turbo introduction to manifolds and tensors, 
intended for readers who have already seen some basic treatment of this material. Even 
within some modern, coordinate-free approach to differential geometry, both abstract and 
computational aspects of GR also require the use of old-fashioned coordinates and indices. 


3. Metric differential geometry. Here the pace slows down. In this brief chapter, which is 
mainly a warm-up for the next two chapters, metrics, geodesics, connections, and the 
Levi-Civita (i.e. metric) connection are introduced. This material is totally standard, but I 
have done my best to give some perspective on geodesics in Lorentzian manifolds. 


4. Curvature. This chapter may have an unusual emphasis on sectional curvature, constant 
curvature, and the nineteenth century origins of the abstract modern theory in submanifolds 
of Euclidean space. In my experience, this background is especially helpful in understanding
the Gauss-Weingarten and Gauss-Codazzi equations, which in turn are essential for the
derivation of the constraint equations of GR. In the same spirit, the last section discusses the 
classical “fundamental theorem for hypersurfaces”, which gives necessary and sufficient 
conditions for the existence and (geometric) uniqueness of embeddings of curved surfaces 
in flat space. Though much simpler, this theorem resembles the corresponding result for 
the Einstein equations in §7.6, notably regarding the role of constraints.


5. Geodesics and causal structure. This chapter introduces the topological and geometric 
techniques, largely developed by Penrose and others in the 1960s, that demarcate mathematical
GR from a theoretical physics treatment. As already mentioned, our discussion is
far from complete, but it is hopefully enough to advocate a specific causal way of thinking. 


13Chapter 2 should perhaps not be discussed in detail (which might repel students); it alone has a summary (§2.7).

14Nonetheless, putting chapter 6 before chapter 7 in this book is a logical choice, since the singularity theorems
do not rely on the Einstein equations. The second half of chapter 8 contains advanced and partly speculative “retro” 
material that I simply find interesting—especially the unresolved problem of time—and could not resist including. 

15Natário (2021), which I saw much too late to use, provides a one-semester course in all of mathematical GR.



The ensuing causal theory has turned out to be very fruitful for the modern study of both 
black holes and PDE aspects of GR. In both respects, one of the central ideas in all of 
mathematical GR is global hyperbolicity. This concept is studied from several equivalent 
definitions, which have been brought to completion only in the twenty-first century. 


6. The singularity theorems of Hawking and Penrose. Penrose’s singularity theorem from 
1965 (which should more aptly be called an incompleteness theorem) remains the most
powerful illustration of the techniques of the previous chapter. But for pedagogical reasons 
I start with Hawking’s singularity theorem (idem dito), which postdated Penrose’s but is 
easier since it does not involve the often counterintuitive “lightlike” (or “null”) reasoning
that is typical of Penrose’s theorem (and indeed of almost all of his work in GR). The 
opening pages of the chapter also provide some insight into the struggle of finding an 
appropriate definition of space-time singularities, from Einstein to Penrose and Hawking. 


7. The Einstein equations. In standard fashion, the Einstein (field) equations are derived from 
an action principle, with extra attention however for boundary terms. This also involves a 
brief treatment of matter sources (i.e. the energy-momentum tensor). The main goal of the 
chapter is to introduce the specific PDE analysis of the Einstein equations introduced by 
Choquet-Bruhat in the 1950s, which she completed in a joint paper with Geroch from 1969. 
This analysis provides and solves a geometric initial-value formulation for the Einstein 
equations, which is far from obvious and circumvents all kinds of conceptual and technical 
questions involving equations posed on a space-time that does not (yet) exist. 


8. The 3+1 split of space-time. For both technical and conceptual reasons (we do not experience
space-time but space and time), it is often helpful to take a “non-relativistic” view on
the Einstein equations. This involves an arbitrary foliation of space-time into spacelike 
hypersurfaces, controlled by the lapse and shift functions of the physics literature (i.e. the 
ADM formalism). Thus the Einstein equations are cleanly split into propagation equations 
and constraint equations, and one has an easy transition to a Hamiltonian formalism. The 
last section concerns the deceptive “problem of time”, which is more or less debunked. 


9. Black holes I: Exact solutions. The theory of black holes is an interplay between abstract 
arguments, like Penrose’s singularity theorem and associated techniques, and concrete 
examples. This chapter is devoted to the latter. After a warm-up on de Sitter space (which is 
not singular but has some kind of horizon), which may be skipped, we study the three main 
cases of interest: Schwarzschild (including the Kruskal extension), Reissner-Nordström, 
and Kerr. Especially the latter is a source of endless fascination, which can only be sparked. 


10. Black holes II: General theory. Penrose remains a central figure in the model-independent 
study of black holes, e.g. through his four closely related concepts of conformal completion, 
null infinity, (absolute) event horizon (which leads to a mathematical definition of a black 
hole), and the diagram named after him; by means of cosmic censorship, and via the 
Penrose inequality. Furthermore, he unearthed the structure of various black hole horizons 
(namely event horizons, Cauchy horizons, and Killing horizons, in introducing all of which 
also Hawking played a major part) as null hypersurfaces ruled by lightlike geodesics. The 
last two sections on uniqueness or “no hair” theorems and on thermodynamics of black 
holes are introductory; alas, they merely scratch the surface of these miraculous topics. 


Finally, Appendix A on Lie groups, Lie algebras, and constant curvature mainly supports §4.4, 
whereas Appendix B on Formal PDE theory gives some background for especially §7.6. 
1 Historical introduction


On 25 November 1915, Einstein submitted a paper containing the above equations, which 
(in an appropriate mathematical context) state his general theory of relativity (GR). Einstein 
thereby replaced Newton’s theory of universal gravity from 1687, and in 1919 he became famous 
overnight when the historical expedition led by Eddington confirmed Einstein’s prediction
(against Newton) of the gravitational deflection of starlight passing near the Sun. Einstein also
computed the correct perihelion shift of Mercury, which had been known since 1859 as an 
anomaly in Newtonian gravity. Still within his conceptual reach were, for example, the properties 
of the binary pulsar PSR B1913+16, discovered in 1974 by Hulse and Taylor, as well as the 
gravitational waves detected by the LIGO experiment in 2015, almost a century after Einstein had 
predicted their existence. Beyond what Einstein himself foresaw or could bear, GR also turned out 
to describe the expansion of the cosmos and hence, unless quantum theory intervenes, its origin
in a big bang (Einstein initially denied the former and never accepted the latter implication).16
Last but not least, GR suggests the possibility of black holes (another “singular” phenomenon 
allowed by his theory that Einstein stubbornly kept disavowing), and, in the hands of Penrose, 
gives compelling conditions for their existence. The fact that these conditions are met in the 
universe is now beyond any (astrophysical) doubt, as reconfirmed by the spectacular image of 
the supermassive black hole M87* revealed in 2019 by the Event Horizon Telescope (EHT). 
Hilbert, the greatest mathematician of his time, expressed his admiration for GR as follows: 


Constructing the theory of general relativity is, in my opinion, one of the greatest achieve- 
ments in scientific history. The edifice that Pythagoras started and Newton continued has 
been completed by Einstein.17 (Hilbert, 1920)


16With hindsight the occurrence of a big bang is implicit in the specific solution to Einstein’s equations that
describes an expanding homogeneous and isotropic universe, found independently by Friedman in 1922 and
Lemaître in 1927. The latter also matched this with contemporary observations of redshifts of galaxies (often but not
quite rightly attributed to Hubble), and is the originator of the physical idea of a hot early state of the universe, which 
he proposed in the early 1930s. See e.g. Kragh (2007) and Nussbaumer & Bieri (2009). In 1965 the 2.7K Cosmic 
Microwave Background was discovered (by coincidence) by Penzias and Wilson. The CMB was interpreted almost 
at once as a relic of the big bang by Dicke and Peebles and others, which matched and revived earlier calculations by 
Gamow and others of the abundances of hydrogen and helium in stars. Within this Zeitgeist Hawking’s singularity 
theorem from 1966, which we will discuss in detail, gives the final mathematical underpinning of the big bang. 

17‘Die Aufstellung der allgemeinen Relativitätstheorie ist m.E. eine der größten Leistungen in der Geschichte
der Wissenschaften. Den von Pythagoras begonnenen, von Newton ausgestalteten, Bau hat Einstein zum Abschluß
gebracht.’ Quoted in Corry (1999, p. 522) from unpublished lecture notes from Hilbert’s 1920 course Mechanik und
neue Gravitationstheorie. As we shall see in §1.7, the relationship between Einstein and Hilbert had ups and downs.



Hilbert knew what he was talking about; we will return to his role in the history of GR (see §1.7),
and more generally to the profound interaction between physics and mathematics in this theory. 
In 1918 Hilbert’s pupil Weyl, a contemporary of Einstein’s and later colleague of his at the 
Institute for Advanced Study in Princeton, wrote the first textbook about GR, called Raum - Zeit - 
Materie (Space - Time - Matter). Its preface starts in the following, even more lyrical way: 


Einstein’s Theory of Relativity has advanced our ideas of the structure of the cosmos a step 
further. It is as if a wall which separated us from Truth has collapsed. Wider expanses and 
greater depths are now exposed to the searching eye of knowledge, regions of which we had 
not even a presentiment. It has brought us much nearer to grasping the plan that underlies 
all physical happening.18 (Weyl, 1918, Vorwort)


Perhaps Einstein himself was not entirely neutral about his work, but he was as lyrical: 


The theory has unsurpassed beauty (...) My boldest dreams have been fulfilled (...) That I 
was given to experience this (...) The highest satisfaction of my life.19


It is all the more remarkable that Einstein found his theory under pretty miserable circumstances. 
First, Germany was in a war which he was one of the very few people (on either side) to oppose. 
This isolated him even among his colleagues, who in any case hardly understood his scientific 
quest. Second, he was separated from his first wife (Mileva) and their two sons (Hans Albert and
Eduard).20 Einstein’s ability to not only continue his work under such conditions but even
produce one of the greatest scientific theories of all time was later explained as follows:


His true passion lay in the understanding of the riddle of the immeasurable world, which 
stood outside and above the bickering and wriggling of personal interests, feelings, and 
urges of people. Seeking such understanding comforted him from the moment he had seen 
through the hypocrisy of the common ideals of decency. The contemplation of this external 
reality lured him like a liberation from an earthly prison.21 (Fokker, 1955)


Einstein’s construction of GR was the culmination of an epic quest for the structure of space 
and time, which had started with his special theory of relativity from 1905. Unlike his earlier 
relativity theory, Einstein’s road to GR is extremely well documented.22 The summary that now
follows suggests that the key to Einstein’s success was not some superhuman genius à la Newton
but his ability to recognize inconsistencies (including his own mistakes) and take it from there. 


18‘Mit der Einsteinschen Relativitätstheorie hat das menschliche Denken über den Kosmos eine neue Stufe
erklommen. Es ist, als wäre plötzlich eine Wand zusammengebrochen, die uns von der Wahrheit trennte: nun liegen
Weiten und Tiefen vor unserm Erkenntnisblick entriegelt da, deren Möglichkeit wir vorher nicht einmal ahnten. Der
Erfassung der Vernunft, welche dem physischen Weltgeschehen innewohnt, sind wir einen gewaltigen Schritt näher
gekommen.’ Translation: Henry L. Brose, from the fourth edition from 1922 (see also §1.8).

19‘Die Theorie ist von unvergleichbarer Schönheit (...) Ich war einige Tage fassungslos von Erregung (...)
Die kühnsten Träume sind nun in Erfüllung gegangen (...) Dass ich das habe erleben dürfen (...) Die höchste 
Befriedigung meines Lebens’. These quotations can be found in the original German in Fölsing (1993), chapter 4. 

20On the other hand, whilst officially still married to Mileva he had started a relationship with Elsa Einstein, who 
also lived in Berlin (Einstein was in fact her maiden name; she was both a first and second cousin to Albert Einstein). 
They got married in 1919 after Einstein’s divorce from Mileva and stayed together until Elsa’s death in 1936. 

21 Dutch original: ‘Zijn ware hartstocht lag in het doorgronden van het raadsel der onmetelijke wereld, die buiten
en boven het geharrewar en het gewriemel van persoonlijke belangen, gevoelens, en driften der mensen stond. Dat
nadenken troostte hem toen hij de schijnheiligheid van de gangbare fatsoenlijke idealen had doorzien. Als een
bevrijding uit een aardse gevangenis lokte hem de beschouwing van die buitenpersoonlijke werkelijkheid.’

22 The main sources for the history of GR are Einstein (1996ab) and Renn (2007), the massive scholarship in
which is based largely on the work of Michel Janssen, John Norton, Jürgen Renn, Tilman Sauer, and John Stachel. 
See also van Dongen (2010, 2017), Janssen (2014), and Janssen & Renn (2020). Standard biographies are Fölsing 
(1993) and Isaacson (2017). The only scientific biography of Einstein (Pais, 1982) is now largely outdated. 




1.1 From physical principles to a mathematical framework 


In an illuminating informal lecture he gave on 14 December 1922 to students at Kyoto University, 
Einstein recalled his first steps towards what ultimately became general relativity: 


The first thought leading to the general theory of relativity occurred to me two years later, 
in 1907, and it did in a memorable setting. I was already dissatisfied with the fact that the 
relativity of motion is restricted to motion with constant relative velocity and does not apply 
to arbitrary motion. I had always wondered privately whether this restriction could somehow 
be removed. In 1907, while trying, at the request of Mr. Stark, to summarize the results of 
the special theory of relativity for the Jahrbuch der Radioaktivität und Elektronik of which 
he was the editor, I realized that, while all other laws of nature could be discussed in terms 
of the special theory of relativity, the theory could not be applied to the law of universal 
gravitation. I felt a strong desire to somehow find out the reason behind this. But this goal 
was not easy to reach. What seemed to me most unsatisfactory about the special theory of 
relativity was that, although the theory beautifully gave the relationship between inertia and 
energy, the relationship between inertia and weight, i.e., the energy of the gravitational field, 
was left completely unclear. I felt that the explanation could probably not be found at all 
in the special theory of relativity. I was sitting in a chair in the Patent Office in Bern when 
all of a sudden I was struck by a thought: “If a person falls freely, he will certainly not feel 
his own weight.” I was startled. This simple thought made a really deep impression on me. 
My excitement motivated me to develop a new theory of gravitation. My next thought was: 
“When a person falls, he is accelerating. His observations are nothing but observations in 
an accelerated system.” Thus, I decided to generalize the theory of relativity from systems 
moving with constant velocity to accelerated systems. I expected that this generalization 
would also allow me to solve the problem of gravitation. This is because the fact that a 
falling person does not feel his own weight can be interpreted as due to a new additional 
gravitational field compensating the gravitational field of the Earth, in other words, because 
an accelerated system gives a new gravitational field.²³


This recollection makes, and somewhat conflates or refers to, at least three different points: 


1. Einstein was haunted by the idea that the “principle of relativity”, which in his special
theory (as well as in Newtonian mechanics) only applies to motion with constant velocity,
should be extended to arbitrary motion, in that the laws of physics should be the same in
any frame of reference.²⁴ Since the special principle of relativity makes uniform motion
relative, Einstein called his extended version the “general principle of relativity”, which he 
considered so important that he would later even name his entire theory after it. 


23 The English translation of the original notes in Japanese taken by Einstein’s tour guide Jun Ishiwara comes
from Einstein (2013), pp. 637-638. Another translation may be found in Physics Today, August 1982, p. 45.

24 If true, this would make all kinds of motion in (otherwise) empty space indistinguishable, and hence both
inertial and accelerated motion would effectively be undefined (as opposed to the situation in both Newtonian
mechanics and, indeed, GR). As a way out, Mach proposed that motion is exclusively defined with respect to all
(other) matter in the universe, even if this only consists of distant stars. Following Einstein himself, this is often
called Mach’s principle, although Barbour (1989, Introduction), claims that Einstein misunderstood Mach’s idea by
conflating its application to Newton’s first law (for which it was apparently intended) and his second (for which it
was not). In any case, Einstein was initially guided by this principle, which until 1918 he did not clearly distinguish
from both the equivalence principle and the principle of general covariance. Although Mach’s principle fails in GR,
as Einstein gradually came to recognize after 1920 (for example, non-flat vacuum solutions to the Einstein equations
violate it), it nonetheless helped Einstein significantly in his search for the field equations (Janssen, 2014).


2. Newton’s theory of gravity, based on action at a distance, was not only physically absurd
(as Newton had already noted himself), but also stood in an uneasy relationship (to say
the least) with special relativity, which postulates the velocity of light as the largest signal
velocity (‘the theory could not be applied to the law of universal gravitation’).


3. Finally, the mysterious equality of gravitational and inertial mass (which in Newtonian
physics is a curious coincidence) led Einstein to introduce the following two-sided coin:


(a) Freely falling observers, who according to someone at rest are accelerating, feel no 
gravity: they may even consider themselves at rest as if there were no gravity. 


(b) Accelerating observers in a situation without gravity may equally well consider 
themselves at rest in a specific gravitational field (pulling in the opposite direction).


Point 1 is controversial and warrants further discussion; see §1.10. Point 2 is resolved by GR; for
example, test particles respond to the local structure of space-time by moving on geodesics.
Part 3(a) was Einstein’s flash of insight that elsewhere he called ‘the happiest thought of my
life’.²⁶ Part 3(b) is Einstein’s equivalence principle,²⁷ which he seems to have arrived at through
3(a). Although at first sight 3(a) and 3(b) appear to be related by interchanging the perspectives
of the two observers involved (and indeed are often conflated), in fact they are quite different:


3(a) The modern way of phrasing this would be that in the frame of reference of an observer
moving along a geodesic (i.e. a freely falling observer), on the geodesic the metric is
the Minkowski metric and even its first derivatives vanish, so that also the Christoffel
symbols vanish and hence the geometry of space-time as well as the motion of freely
falling particles accompanying our observer are approximately flat.²⁸ Nonetheless, the
gravitational field is not completely “transformed away” for the freely falling observer:
the Riemann curvature tensor (which involves second derivatives of the metric) is nonzero
even on the geodesic, and tidal forces (which are described by the Riemann tensor) are
still there (cf. §5.1). Einstein understood this well before GR was established and hence
3(a) was not his equivalence principle; confusingly to many,²⁹ it was a heuristic towards it.


3(b) Version (b), which in turn became a crucial heuristic for Einstein in finding GR, is
different from merely changing perspective in version (a). That would mean that instead
of identifying with the freely falling observer, one identifies with Einstein sitting in his
office, watching the man fall. Einstein then feels a pull downward by a gravitational force,
which is what according to version (a) the freely falling man does not feel. However,
Einstein’s equivalence principle takes place in Minkowski space-time,³⁰ and states that a
system undergoing constant acceleration may, as far as all laws of nature are concerned,
equivalently consider itself at rest in a homogeneous gravitational field. Think of pushing
the gas pedal in a car (preferably with blinded windows); the backward pull the driver
feels from the acceleration is indistinguishable from a horizontal gravitational pull.
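
In formulas, the content of 3(a) may be summarized roughly as follows. This is only a sketch, assuming coordinates $x^\mu$ adapted to the geodesic $\gamma$ of the freely falling observer (such as the Fermi normal coordinates mentioned in §5.2); the symbols $\eta_{\mu\nu}$ (Minkowski metric), $\Gamma^{\rho}_{\mu\nu}$ (Christoffel symbols), $|\vec{x}|$ (spatial distance to $\gamma$) and the schematic $O(\cdot)$ remainder are our own notation, and the exact coefficients of the remainder depend on conventions not fixed here.

```latex
% On the geodesic: Minkowski metric, vanishing first derivatives, hence vanishing Christoffel symbols
\begin{equation*}
  g_{\mu\nu}\big|_{\gamma} = \eta_{\mu\nu}, \qquad
  \partial_\rho g_{\mu\nu}\big|_{\gamma} = 0
  \quad\Longrightarrow\quad
  \Gamma^{\rho}_{\mu\nu}\big|_{\gamma} = 0 .
\end{equation*}
% Near the geodesic: deviations from flatness are of second order and governed by curvature
\begin{equation*}
  g_{\mu\nu}(x) = \eta_{\mu\nu} + O\big(|\mathrm{Riem}|\,|\vec{x}|^{2}\big) .
\end{equation*}
```

The quadratic remainder is controlled by the Riemann tensor, which is why tidal effects survive for the freely falling observer even though the first-order effects of gravity do not.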


25 See Proposition 7.2 and footnote 319. One must concede that the idea is clearer than its mathematical execution.

26 “Da kam mir der glücklichste Gedanke meines Lebens.” See Einstein (2002a), Doc. 31, page 265.

27 Two leading scholarly papers on Einstein’s equivalence principle are Norton (1985) and Lehmkuhl (2019).

28 See the end of §5.2 for the mathematical underpinning of this claim through Fermi normal coordinates.

29 Perhaps starting with Pauli (1921, §51), who states that ‘For the general case, [the equivalence principle] can
be formulated in the following way: For every infinitely small world region (...) there always exists a coordinate
system (...) in which gravitation has no influence either on the motion of particles or any other physical processes.’

30 In 1907 Einstein had not yet internalized this concept and thought in terms of 3d “relative space”, but in later
formulations he talks about ‘spacetime regions’ in ‘the limiting case of special relativity’. See Norton (1985), §2.




We would now say that version (a) is about real gravitational fields, whereas version (b) is about 
fictitious gravitational fields, whose properties are studied from their claimed equivalence with 
the effects of accelerated motion in Minkowski space-time, as if they were real. But apparently 
Einstein, whose views were often different from our modern ones, thought of them as real!³¹

This can be made more precise by using comoving coordinates (with the accelerated observer), 
in which Christoffel symbols appear, of the same kind that locally describe “real” gravity (see also 
§1.10). Accordingly, Einstein’s strategy, effective from 1912 onwards, was to infer properties of 
real gravity from known properties of accelerated motion, reinterpreted as the effects of fictitious 
gravity (as we see it, or as real gravity as Einstein saw it) as felt by an observer who believes himself
to be at rest (instead of accelerating). Its most important application, dating from 1912, is 
Einstein’s crucial insight that gravity requires curved space-time, and hence should be based on 
differential geometry. This was the mathematical key to GR. With hindsight, the argument is 
quite straightforward: in the usual coordinates (t,x,y,z) the Minkowski line element is 


\[ ds^2 = -c^2\,dt^2 + dx^2 + dy^2 + dz^2, \tag{1.1} \]

but in arbitrary coordinates (seen by Einstein as describing an arbitrary reference frame) it is

\[ ds^2 = g_{\mu\nu}(x)\,dx^{\mu}dx^{\nu}, \tag{1.2} \]

where $x = (x^{\mu}) = (x^0, x^1, x^2, x^3)$, and the $g_{\mu\nu}(x)$ are certain functions (jointly forming a Lorentzian
metric at each point of space-time, as we would now say). According to version (b) of the
equivalence principle, then, an observer who is accelerating with respect to the coordinates $(t,x,y,z)$
may use comoving coordinates $(x^{\mu})$ in which he is entitled to feel at rest in a gravitational
field. At the same time, his line element is (1.2) rather than (1.1), and so he attributes the
effects of this field to the functions $g_{\mu\nu}$. Einstein’s leap of faith, one of the most successful in
the history of science, was to generalize this argument from gravitational fields “caused” by
acceleration (whatever their ontological status) to all gravitational fields, and hence claim that
gravity is described by a metric tensor (or, as he later proposed, by its Christoffel symbols).
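
As a concrete illustration of how (1.1) turns into a line element of the form (1.2), consider an observer with constant proper acceleration $a$ along the $x$-axis. The sketch below uses what are nowadays called Rindler coordinates; this choice, the primed coordinates $(t',x',y',z')$ and the symbol $a$ are our assumptions for the sake of the example and are not taken from Einstein’s own papers.

```latex
% Comoving coordinates (t', x', y', z') of an observer with constant proper acceleration a;
% the observer sits at x' = 0 and passes through x = 0 at t = t' = 0.
\begin{align*}
  ct &= \Big(x' + \frac{c^2}{a}\Big)\sinh\!\Big(\frac{a t'}{c}\Big), &
  x  &= \Big(x' + \frac{c^2}{a}\Big)\cosh\!\Big(\frac{a t'}{c}\Big) - \frac{c^2}{a}, &
  y  &= y', \quad z = z' .
\end{align*}
% Substituting into (1.1): the mixed dt' dx' terms cancel, and cosh^2 - sinh^2 = 1 gives
\begin{equation*}
  ds^2 = -c^2\Big(1 + \frac{a x'}{c^2}\Big)^{2}(dt')^2 + (dx')^2 + (dy')^2 + (dz')^2 .
\end{equation*}
```

To first order in $a x'/c^2$ the time component reads $-(c^2 + 2 a x')$, so that $a x'$ plays the role of the Newtonian potential of a homogeneous field; this is the sense in which the accelerated observer may attribute the changed metric to gravity.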

In actual fact, this insight came to Einstein in two steps, the first related to curved time 
and the second to curved space. First, in a constantly and linearly accelerated reference frame 
(t',x’,y’,z’), which his equivalence principle was always about, the line element (1.2) becomes 


\[ ds^2 = -c^2(x',y',z')\,(dt')^2 + (dx')^2 + (dy')^2 + (dz')^2, \tag{1.3} \]


so by version (b) of the equivalence principle gravity changes the metric in the time-like 
direction.³² Second, consider a child moving on a merry-go-round, with a parent waiting outside.
According to the latter, the child is accelerated inward, but according to the equivalence principle 
version (b) the child (who has studied general relativity) may claim to be at rest in a gravitational 
field that is radially outward directed and gives the centrifugal pull the child feels when it holds 
tight to the wooden horse. Now return to the parent, who knows special relativity (which without 
gravity is enough), and hence knows that moving objects contract in the direction of motion. 
This affects the length of measuring rods tangent to the circumference of the disc, but not those 
perpendicular to it (i.e. lying in the radial direction). Consequently, to the parent the ratio 
circumference/radius exceeds $2\pi$. By the equivalence principle this is also true for the prodigy,
who thereby concludes that gravity requires non-Euclidean geometry, at least spatially.³³
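
The counting behind the merry-go-round argument can be sketched as follows. This is only the heuristic version, under the idealization of perfectly rigid rods; the disc radius $r$, the angular velocity $\omega$, the rest length $\ell$ of a measuring rod, and the rod counts $N_{\mathrm{circ}}$ and $N_{\mathrm{rad}}$ are names introduced here for illustration.

```latex
% Rim speed v = \omega r < c. In the parent's frame, tangential rods are Lorentz-contracted
% to length \ell\sqrt{1 - v^2/c^2}, whereas radial rods keep their length \ell.
\begin{align*}
  N_{\mathrm{circ}} &= \frac{2\pi r}{\ell\,\sqrt{1 - \omega^2 r^2/c^2}}, &
  N_{\mathrm{rad}}  &= \frac{r}{\ell}, &
  \frac{N_{\mathrm{circ}}}{N_{\mathrm{rad}}}
    &= \frac{2\pi}{\sqrt{1 - \omega^2 r^2/c^2}} \;>\; 2\pi .
\end{align*}
```

More rods fit along the rim than Euclidean geometry would allow, so the ratio of circumference to radius as measured with co-rotating rods exceeds $2\pi$.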


31 See Norton (1985) and Lehmkuhl (2019). Accordingly, as part of his unholy alliance with Mach’s principle (see
footnote 24), Einstein attempted to find a material source of this induced gravitational field “at infinity”.

32 This leads to the prediction of gravitational deflection of light, which was the most famous early test of GR.

33 See Stachel (1980). Einstein’s argument is more complicated than it sounds, since the description of uniformly
rotating solid discs in special relativity is tricky. However, it was just a heuristic and should be seen as such.


1.2 Riemannian geometry


Fortunately, as a student at the ETH Zürich, where he studied from 1896-1900 to become a 
mathematics teacher (Fachlehrer in mathematischer Richtung), Einstein had taken courses in 
differential geometry (Infinitesimalgeometrie) and geometric invariant theory (Geometrische 
Theorie der Invarianten) from a good mathematician called Carl Friedrich Geiser (1843-1934).³⁴
Though elementary, these courses were exactly what Einstein needed in 1912 to orient himself 
towards the right mathematical framework for GR (as he later acknowledged): apart from 
connections to the theory of functions, the differential geometry course for example discussed 
topics like coordinate systems, surfaces, line elements, (Gaussian) curvature, and geodesics. 

Einstein’s second stroke of luck was that in August 1912 he had moved from Prague to Zürich
to become a professor at the ETH, where he renewed his friendship with his former ETH classmate 
Marcel Grossmann (1878-1936), who had been a professor of mathematics at the ETH since 
1907. Grossmann was an expert in non-Euclidean geometry and introduced Einstein to the latest 
developments in this area. Non-Euclidean geometry had started secretly with Carl Friedrich 
Gauss (1777-1855), one of the greatest mathematicians of all time, who however during his 
lifetime only published his technical work on lines and surfaces embedded in Euclidean space $\mathbb{R}^3$
that launched the field of differential geometry (Gauss, 1828).³⁶ This published work includes
the description of curvature and the Theorema Egregium (i.e. ‘remarkable theorem’), which
states that what we now call the Gaussian curvature of a surface, though initially defined via its
embedding in $\mathbb{R}^3$ (i.e. extrinsically), is in fact independent of the embedding and hence is an
intrinsic property. See eq. (4.76) in §4.3. Thus a surface has both intrinsic and extrinsic curvature,
a fact which, jumping ahead of Einstein, in one dimension higher (namely a three-dimensional
space embedded in a four-dimensional space-time) will play a central role in the initial-value 
problem of GR (as well as in its closely related Hamiltonian formulation). See chapter 8. 
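
To see what “intrinsic” means in practice, here is a small worked example (ours, not Gauss’s). For a surface whose line element has the orthogonal form $ds^2 = E\,du^2 + G\,dv^2$, a standard formula expresses the Gaussian curvature $K$ through the metric coefficients $E$ and $G$ alone, without reference to any embedding. The symbols $E$, $G$, $(u,v)$ and the sphere radius $a$ are notational assumptions made for this sketch.

```latex
% Gaussian curvature of an orthogonal metric ds^2 = E du^2 + G dv^2
\begin{equation*}
  K = -\frac{1}{2\sqrt{EG}}
      \left[ \frac{\partial}{\partial u}\!\left(\frac{\partial_u G}{\sqrt{EG}}\right)
           + \frac{\partial}{\partial v}\!\left(\frac{\partial_v E}{\sqrt{EG}}\right) \right].
\end{equation*}
% Sphere of radius a: (u,v) = (\theta,\varphi), E = a^2, G = a^2 \sin^2\theta, so \sqrt{EG} = a^2 \sin\theta
\begin{equation*}
  K = -\frac{1}{2a^2\sin\theta}\,\frac{\partial}{\partial\theta}\big(2\cos\theta\big)
    = \frac{1}{a^2} .
\end{equation*}
```

The result $K = 1/a^2$ is obtained from the line element on the sphere only, which is the point of the Theorema Egregium.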

The work of Gauss was taken a decisive step further by his brilliant (perhaps even greater)
student Bernhard Riemann (1826-1866). In his extraordinarily visionary Habilitation lecture on
June 10, 1854, Riemann simply left out the ambient Euclidean space and also worked in arbitrary
dimension.³⁷ This lecture starts in the following provocative way:³⁸


34 Einstein in fact rarely went to lectures and prepared himself for exams using notes taken by Grossmann.

35 See Sauer (2014) for a survey of Grossmann’s interaction with Einstein and his contributions to GR.

36 A very good introduction to this work is Volume 2 of Spivak (1999), chapter 3, which includes a translation.

37 Also here Volume 2 of Spivak (1999), chapter 4, is an excellent introduction. Riemann’s lecture from 1854 was 
given for a non-mathematical audience including philosophers (but also the aging Gauss) and so it contains almost 
no equations. What we now call the Riemann tensor was first given in an initially unpublished prize essay on heat 
conduction, known among historians as the Commentatio, which first appeared in Riemann (1876), pp. 370-383. In 
this essay, Riemann models heat flow using a three-dimensional metric, whose local flatness (which turned out to be 
physically interesting) he relates to the vanishing of the curvature tensor, cf. Theorem 4.1 in the present book. See 
Farwell & Knee (1990) and Darrigol (2014). The transition from Gauss to Riemann is described in detail by Reich 
(1973), and is embedded in the general history of nineteenth century geometry by Gray (2007).

38 “Bekanntlich setzt die Geometrie sowohl den Begriff des Raumes, als die ersten Grundbegriffe für die
Constructionen im Raume als etwas Gegebenes voraus. Sie giebt von ihnen nur Nominaldefinitionen, während die
wesentlichen Bestimmungen in Form von Axiomen auftreten. Das Verhältniss dieser Voraussetzungen bleibt dabei
im Dunkeln; man sieht weder ein, ob und in wie weit ihre Verbindung nothwendig, noch a priori, ob sie möglich ist.
Diese Dunkelheit wurde auch von Euklid bis auf Legendre, um den berühmtesten neueren Bearbeiter der Geometrie 
zu nennen, weder von den Mathematikern, noch von den Philosophen, welche sich damit beschäftigten, gehoben. 
Es hatte dies seinen Grund wohl darin, dass der allgemeine Begriff mehrfach ausgedehnter Grössen, unter welchem 
die Raumgrössen enthalten sind, ganz unbearbeitet blieb. Ich habe mir daher zunächst die Aufgabe gestellt, den 
Begriff einer mehrfach ausgedehnten Grösse aus allgemeinen Grössenbegriffen zu construiren. Es wird daraus 
hervorgehen, dass eine mehrfach ausgedehnte Grösse verschiedener Massverhältnisse fähig ist und der Raum also 




It is known that geometry assumes, as things given, both the notion of space and the first 
principles of constructions in space. She gives definitions of them which are merely nominal, 
while the true determinations appear in the form of axioms. The relation of these assumptions 
remains consequently in darkness; we neither perceive whether and how far their connection 
is necessary, nor a priori, whether it is possible. 


From Euclid to Legendre (to name the most famous of modern reforming geometers) this 
darkness was cleared up neither by mathematicians nor by such philosophers as concerned 
themselves with it. The reason of this is doubtless that the general notion of multiply
extended magnitudes (in which space-magnitudes are included) remained entirely unworked. 
I have in the first place, therefore, set myself the task of constructing the notion of a multiply 
extended magnitude out of general notions of magnitude. It will follow from this that a 
multiply extended magnitude is capable of different measure-relations, and consequently 
that space is only a particular case of a triply extended magnitude. (Riemann, 1854) 


As Gray (2007, p. 193) put it, ‘Euclid’s postulates are completely subverted: no longer can 
they be regarded as unproblematically true assumptions about physical space.’ Even in two 
dimensions, the (hyperbolic) non-Euclidean geometries discovered two decades earlier by Bolyai 
and Lobachevskii were far from the only possibilities (although, as Riemann mentioned, they do 
have special symmetry properties).³⁹ Riemann’s main ideas, hardly formalized by him however,
were firstly that of a manifold (described by him as an ‘n-fach ausgedehnte Grösse’, and later
even as a ‘Mannigfaltigkeit’),⁴⁰ and secondly that of a metric (defined on a manifold), which he
made responsible for derived notions like distance, angles, geodesics, and curvature, and as such
identified as the basis of geometry. Applied to space-time rather than space,⁴¹ this turned out to
be exactly what was needed for GR. This gives the second great and remarkable example of a
piece of mathematics that was initially developed for purely intrinsic (i.e. mathematical) reasons
but later turned out to provide the right language for some profound new physical theory.⁴²


nur einen besonderen Fall einer dreifach ausgedehnten Grösse bildet.” Translation by W.K. Clifford. 

39 Riemann did not name Bolyai and Lobachevskii and probably did not know their work. Yet one of the few
formulas in his lecture gives the metric of hyperbolic space as an example of a space with constant curvature. 

40 The historical development of the concept of a manifold is described by Scholz (1980, 1999). The word 
‘Mannigfaltigkeit’ had been used by Gauss in lectures, but always in the context of subspaces of (what we now call) 
$\mathbb{R}^n$, and it really seems to have been Riemann who conceived the general notion, including hints towards global
structure described by overlapping charts (one may even argue that his habilitation lecture foreshadowed both set
theory and topology; for example, he explicitly left room for discrete as opposed to continuous structures, and in his
earlier PhD thesis from 1851 Riemann had even talked about infinite-dimensional spaces of functions). However, as
Scholz notes (p. 30), ‘The reception and assimilation of Riemann’s concept of a manifold to the mathematics of
the 19th century was slow and inhibited by severe conceptual problems’. As we shall see, Ricci (and Levi-Civita)
even turned the clock back by omitting any reference to global structure and basing their tensor calculus entirely
on the use of coordinates without a specified domain (which was often implicitly taken to be $\mathbb{R}^n$). Nonetheless,
in a development in which topology, geometry, and function theory can hardly be separated, through the work of
Beltrami, Helmholtz, Klein, Möbius, Jordan, Schläfli, Betti, Poincaré, Brouwer, Hausdorff and others, the modern
notion of a manifold finally arose. In dimension two, Hilbert (1902b) sketched the modern definition in terms of 
open neighbourhoods, charts, and coordinate changes (where he also had to define the fundamental notions of 
topology, a subject that at the time had by no means been brought into final form). This was subsequently formalized 
by Weyl (1913) in dimension two, and then by Veblen & Whitehead (1932) and Whitney (1936) in general. 

41 Even this twist we owe to a mathematician, namely Minkowski, cf. Corry (2004). 

42 The first great example is the application of the conic sections of the ancient Greeks (first described in the fourth
century B.C. in the context of problems in Euclidean geometry) to motion in a gravitational field, starting with
Galilei’s parabolic motion of projectiles on earth and culminating in Newton’s derivation of Kepler’s laws describing
the elliptic motion of planets in Principia, one of the highlights in the history of science, on a par with the discovery
of GR. The second great example, then, is GR. The third is functional analysis, which developed out of abstract


1.3 Absolute differential calculus and general covariance


To this remarkable success story one should add the computational device that made Riemannian 
geometry workable for Einstein, namely the “absolute differential calculus” developed by 
Gregorio Ricci-Curbastro (1853-1925), also simply called Ricci.⁴³ This calculus was written
down in final form in 1901 in a joint paper by Ricci and his student Tullio Levi-Civita (1873-1941),
who also became interested in GR (including a personal friendship with Einstein).⁴⁴

Of historical interest only now, this paper is very instructive as a portrait of the mathematical 
world which Einstein inhabited in the 1910s. The absolute differential calculus (or tensor 
calculus) uses formal real variables $x_1, x_2, \ldots, x_n$, but any kind of geometric perspective is absent:
the calculus is a completely formal mix of algebra and analysis. Even the abstract framework of 
(multi)linear algebra is lacking (multilinear maps are written in terms of their components relative 
to a basis), so that everything is written down in terms of indices and tensors are defined (as they 
still are in some modern physics books) by their behaviour under coordinate transformations. 
The main achievement of the absolute differential calculus is the introduction of the covariant 
derivative on arbitrary tensors, along with all the rules for working with it. This gives the 
Riemann tensor, the Ricci tensor, the Ricci scalar, and many other similar constructions, studied 
from the point of view of invariant theory (as opposed to geometry). 
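
In modern index notation (which differs from that of the 1901 paper), the two ingredients just mentioned may be sketched as follows: the defining transformation behaviour of tensor components, and the covariant derivative built from the Christoffel symbols of a metric. The tensor $T$, the vector field $V$, and the coordinate change $x \mapsto x'(x)$ are generic placeholders chosen for illustration.

```latex
% Transformation law of the components of a (1,1)-tensor under a coordinate change x -> x'(x)
\begin{equation*}
  T'^{\mu}{}_{\nu}(x') = \frac{\partial x'^{\mu}}{\partial x^{\alpha}}\,
                         \frac{\partial x^{\beta}}{\partial x'^{\nu}}\; T^{\alpha}{}_{\beta}(x) .
\end{equation*}
% Christoffel symbols of a metric g, and the covariant derivative of a vector field V
\begin{align*}
  \Gamma^{\rho}_{\mu\nu} &= \tfrac12\, g^{\rho\sigma}\big(\partial_\mu g_{\sigma\nu}
      + \partial_\nu g_{\sigma\mu} - \partial_\sigma g_{\mu\nu}\big), &
  \nabla_\mu V^{\nu} &= \partial_\mu V^{\nu} + \Gamma^{\nu}_{\mu\rho}\, V^{\rho} .
\end{align*}
```

Unlike the partial derivatives $\partial_\mu V^{\nu}$, the components $\nabla_\mu V^{\nu}$ again obey the tensor transformation law, which is what makes equations written in terms of $\nabla$ take the same form in all coordinate systems.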

The next stage in Einstein’s path to GR only makes sense if we understand Einstein’s 
conflation of general covariance with a relativity principle.⁴⁶ It is crucial to realize that for
us, coordinate systems are arbitrary, physically dead labelings of points in space-time. But for
Einstein, coordinates were alive as physical frames of reference, in the sense that the system
$(x^0, \vec{x})$ really describes the world line $(x^0(t), \vec{x}(t)) = (t, \vec{x})$ of an observer who is spatially at rest
at $\vec{x}$, but moves in time $t$, including the stipulation that events at $(t, \vec{x})$ and $(t, \vec{y})$ are simultaneous,


nineteenth century analysis and turned out to be exactly the right mathematical language for quantum mechanics 
(e.g. Landsman, 2019). This phenomenon is still not well understood. Hilbert and his circle, who played a key role 
in both the second and the third example (i.e. GR and quantum theory), invoked what they called a “pre-established 
harmony between physical nature and mathematical mind” (Corry, 2004), but this seems a sledge-hammer argument 
that explains nothing. Note that the issue is not what Wigner (1960) famously called the ‘unreasonable effectiveness 
of mathematics in the natural sciences’, or, in other words, the ‘appropriateness of the language of mathematics 
for the formulation of the laws of physics’, which, he added lyrically but misleadingly, ‘we neither understand 
nor deserve’. Without in any way lessening our admiration for Newton’s genius, we perfectly well understand the 
applicability of the calculus to classical mechanics, since Newton purposely developed those in close interaction 
with each other. Our point is that the conic sections were already there, waiting for him. The miracle, if there is one, 
is the applicability of mathematical concepts that were invented purely for their own sake to physical theories like 
GR and quantum mechanics, which postdate these inventions with no apparent link or common cause. 

43 See Reich (1994) for the relevant mathematical history in depth and Goodstein (2018) for (light) biography.

44 Levi-Civita later wrote a textbook on the absolute differential calculus including its application to GR
(Levi-Civita, 1926; Italian original from 1923), in which he uses the concept of parallel transport he had invented himself
in the wake of GR. This makes the 1923 book slightly more geometric than the 1901 paper, but most of the 
comments in the main text about the 1901 paper also apply to Levi-Civita’s book. Almost simultaneously, the Dutch 
mathematician Jan Arnoldus Schouten (1883-1971) published his book Schouten (1924), dedicated to Ricci, which 
is similar to Levi-Civita’s book except that it only mentions Einstein in a footnote as someone who applied the
theory of linear connections to physics (‘Physikalische Anwendungen gaben Weyl, Eddington, und Einstein’), and 
leaves it at that as far as GR is concerned. Schouten founded a Dutch school in tensor calculus that involved e.g. 
Dirk-Jan Struik (1894-2000), who later became a well-known (Marxist) historian of mathematics, and Max Euwe 
(1901-1981), who was originally a mathematics high-school teacher but is better known from his career in chess, in 
which he was world champion from 1935-1937. He later became one of the first Dutch computer scientists. 

45 The Lie derivative is still absent from the tensor calculus; it was introduced in 1931 by the Polish mathematician
Władysław Ślebodziński (1884-1972), who survived Auschwitz and two other concentration camps.

46 Relevant literature, none of which we literally follow, includes Norton (1989, 1993) and Janssen (2012, 2014).

at least for the observer moving along the given world line.⁴⁷ In order to even talk about,
for example, the speed of light, the coordinate $t$ must then be given the physical meaning of
time, and the coordinate difference $|\vec{x} - \vec{y}|$ the physical meaning of distance. For Einstein, then,
inertial frames are described by distinguished coordinate systems, and coordinate transformations 
correspond to changes in frames of reference, which may or may not preserve inertial frames. In 
special relativity inertial frames correspond to geodesics,⁴⁸ and hence to do justice to the two
ingredients of the principle of special relativity (i.e. relativity of uniform motion and constancy of 
the speed of light) it seems natural to define symmetries as transformations (i.e. diffeomorphisms) 
of space-time that map geodesics into geodesics and preserve the speed of light. This precisely 
gives the Poincaré transformations (which are Lorentz transformations combined with constant 
translations in space-time), which in turn coincide with the isometries of the Minkowski metric.⁴⁹
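
Written out, the statement that Poincaré transformations are the isometries of the Minkowski metric amounts to the following; the symbols $\Lambda$ (Lorentz transformation), $a$ (constant translation) and $\eta$ (Minkowski metric) are standard, but they are our notation here rather than something fixed earlier in the text.

```latex
% A Poincare transformation: Lorentz transformation \Lambda followed by a constant translation a
\begin{align*}
  x'^{\mu} &= \Lambda^{\mu}{}_{\nu}\, x^{\nu} + a^{\mu}, &
  \eta_{\mu\nu}\,\Lambda^{\mu}{}_{\rho}\,\Lambda^{\nu}{}_{\sigma} &= \eta_{\rho\sigma} .
\end{align*}
% Consequently the Minkowski line element keeps its form
\begin{equation*}
  \eta_{\mu\nu}\,dx'^{\mu}dx'^{\nu} = \eta_{\rho\sigma}\,dx^{\rho}dx^{\sigma} .
\end{equation*}
```

Being affine, such maps send straight lines (the geodesics of Minkowski space-time) to straight lines, and the condition on $\Lambda$ ensures that light cones, and hence the speed of light, are preserved.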

Einstein’s reasoning then seems to have been as follows. The special principle of relativity 
states that the laws of physics (including constancy of the speed of light but excluding gravity) 
are the same in each inertial frame (and in no others). Hence the special principle of relativity 
is equivalent to the invariance of the laws of physics under certain special coordinate
transformations, namely Poincaré transformations. Therefore, the general principle of relativity
(which Einstein was after because he liked Mach’s principle and in special relativity disliked the
presence of special coordinate systems, which he identified with inertial reference frames) should
consist of the invariance of the laws of physics under general coordinate transformations.⁵⁰

By a stroke of fortune Ricci’s absolute differential calculus gave Einstein partial differential 
equations for physics that were invariant under general coordinate transformations; this was 
even what Ricci meant by “absolute”. And this, in Einstein’s view, made all physical frames 
of reference equivalent and gave him the mathematical machinery for his “general principle of 
relativity”. The equivalence principle then implied that general relativity is only possible in the 
presence of gravity, indeed is a theory of gravity, which is then automatically generally covariant. 

The requirement of general covariance was one of the keys for Einstein in finding his field 
equations during the years 1913-1915, though not without a distraction in the form of the Hole 
Argument, as we shall see shortly. Later in his life Einstein increasingly came to believe that 
mathematics (and in particular the idea of general covariance) had been the key to his success, 
which (not even mentioning his own physical insights) already in 1915 he had described as a 
‘real triumph of the general method of the differential calculus developed by Gauss, Riemann, 
Christoffel, Ricci, and Levi-Civita’. His most blatant statement in this direction is probably that:


47Hence a reference frame should perhaps be taken to be a congruence of geodesics, rather than a single one. 

48It was Einstein who reintroduced the geometric concept of a geodesic in this context, a crucial move towards
the current reconciliation of the tensor calculus with differential geometry, but formulated in terms of coordinates.

49The diffeomorphisms of $\mathbb{R}^4$ that merely preserve the geodesics of the Minkowski metric just have to preserve
straight lines and hence correspond to affine maps, i.e., linear transformation plus translation. This is also true in 
Euclidean space, where affine transformations preserve straight lines but only isometries also preserve distances. 
In the Euclidean case the linear part of an isometry must be a rotation or a reflection, whereas in the Minkowski 
case it must be a Lorentz transformation (which notion by definition includes spatial and temporal reflections). See 
also Kobayashi & Nomizu (1963), chapter VI, for the notion of affine transformations of manifolds with an affine 
connection, such as the Levi-Civita connection, and, in that case, their relationship to isometries. 

50We return to this issue in §1.10. For now, we just mention that Einstein’s argument is widely regarded as
suspicious and that the correct generalization of his reasoning about special relativity would be to say that symmetries 
of a specific space-time (including a metric) are isometries (which in particular map geodesics to geodesics), whereas 
the symmetries of GR as a theory are diffeomorphisms (or, for that matter, general coordinate transformations). 
Since any kind of relativity of motion should refer to some specific space-time, it would be a category mistake 
to infer it from invariance properties of the theory as a whole. If anything, motion is relative only with respect to 
(non-trivial) isometries of a fixed space-time (if these exist), which preserve geodesics and Lorentzian distances. 


Historical introduction 


The gravitational equations could only be found by a purely formal principle (general 
covariance), that is, by trusting in the largest imaginable logical simplicity of the natural 
laws.⁵¹ (Einstein to De Broglie, 1954)


In his Herbert Spencer Lecture in Oxford, 1933, he mused: 


Newton (...) still believed that the basic concepts and laws of his system could be derived 
from experience (...). It was the general Theory of Relativity which showed in a convincing 
way the incorrectness of this view. For this theory revealed that it was possible for us, using 
basic principles very far removed from those of Newton, to do justice to the entire range of 
the data of experience in a manner even more complete and satisfactory than was possible 
with Newton’s principles. But quite apart from the question of comparative merits, the 
fictitious character of the principles is made quite obvious by the fact that it is possible to 
exhibit two essentially different bases, each of which in its consequences leads to a large 
measure of agreement with experience. This indicates that any attempt logically to derive 
the basic concepts and laws of mechanics from the ultimate data of experience is doomed 
to failure. If then it is the case that the axiomatic basis of theoretical physics cannot be an 
inference from experience, but must be free invention, have we any right to hope that we 
shall find the correct way? Still more—does this correct approach exist at all, save in our 
imagination? Have we any right to hope that experience will guide us aright, when there are 
theories (like classical mechanics) which agree with experience to a very great extent, even 
without comprehending the subject in its depths? To this I answer with complete assurance, 
that in my opinion there is the correct path and, moreover, that it is in our power to find 
it. Our experience up to date justifies us in feeling sure that in Nature is actualized the 
ideal of mathematical simplicity. It is my conviction that pure mathematical construction 
enables us to discover the concepts and the laws connecting them which give us the key to 
the understanding of the phenomena of Nature. Experience can of course guide us in our 
choice of serviceable mathematical concepts; it cannot possibly be the source from which 
they are derived; experience of course remains the sole criterion of the serviceability of a 
mathematical construction for physics, but the truly creative principle resides in mathematics. 
In a certain sense, therefore, I hold it to be true that pure thought is competent to comprehend 
the real, as the ancients dreamed. (Einstein, 1934, pp. 166-167) 


Similarly, Hilbert, to whose role in the development of GR we will return in §1.7, saw Einstein’s 
theory as the final demise of the idea that physical theories should be based on experience: 


In Einstein’s theory we now have a consistent field theory before us; the second stage
in the development of physics has thereby been reached. What happens is not merely 
switching off the senses, as is the case with mechanics, but rather the complete elimination 
of anthropomorphism. The conceptual structures have completely emancipated themselves 
from the usual sense impressions, and it is precisely by getting rid of these that objectivity 
in the understanding of the laws of nature as well as the unity and clarity of the theoretical 
system are achieved. In this regard, I would like to regard the general principle of relativity 
as the highest triumph of the mind over the world of appearances.⁵² (Hilbert, 1919/1920)


51 ‘Die Gravitationsgleichungen waren nur auffindbar auf Grund eines rein formalen Prinzips (allgemeine
Kovarianz), d.h. auf Grund des Vertrauens auf die denkbar grösste logische Einfachheit der Naturgesetze.’ Quoted 
in van Dongen (2010), pp. 2-3, whose book is a major analysis of the issue at hand. See also van Dongen (2017). 

52 ‘In dieser Einsteinschen Theorie haben wir nun eine konsequente Feldtheorie vor uns; die zweite Stufe in der 
1.4 Towards the gravitational field equations: Entwurf Theorie


However, historical reconstruction has shown that the truth may have been quite different.
Considerable evidence shows that Einstein did find his equations by the mathematical requirement
of general covariance, but combined with various physical requirements he always had in mind,⁵³
notably the necessity of the correct Newtonian limit as well as of energy-momentum conservation.


Specifically, after he had realized that Newton’s gravitational force (or rather its scalar
potential $\varphi$) should be replaced by the 10 components of the metric tensor $g_{\mu\nu}$, and through
Grossmann had familiarized himself with the necessary mathematics, from the autumn of 1912
onwards Einstein actively tried to involve the metric $g_{\mu\nu}$ in generalizing Poisson’s equation

$$\Delta\varphi = -4\pi G\rho, \tag{1.4}$$

where $G$ is Newton’s gravitational constant and $\rho$ is the matter density. He aimed at the structure

$$Q_{\mu\nu} = \kappa T_{\mu\nu}, \tag{1.5}$$

where $T_{\mu\nu}$ is the energy-momentum tensor that had already been introduced in special relativity
by von Laue and had been generalized to curved space-time by Kottler, $\kappa$ is some constant
(which later became $\kappa = 8\pi G$), and $Q_{\mu\nu}$ is some tensor to be constructed from the metric using
the absolute differential calculus.⁵⁴ It took Einstein three years to get to the correct expression

$$Q_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R, \tag{1.6}$$

during which he ‘dedicated himself to the problem of gravitation with superhuman effort’.⁵⁵


In retrospect he was almost there right from the start, since after Grossmann had pointed
out the Riemann tensor to him in the autumn of 1912 Einstein at once tried the associated Ricci
tensor $Q_{\mu\nu} = R_{\mu\nu}$, but this turned out to give the wrong Newtonian limit, or so he thought; in
fact, the problem did not lie in the omission of the $-\tfrac{1}{2} g_{\mu\nu} R$ term, but in the coordinates he
used, as well as in a misconception that would trouble Einstein for years to come, namely that
in the Newtonian limit (and in suitable coordinates) the metric should take the diagonal form
(1.3), where the variable speed of light c(x’,y’,z’) takes care of “everything”. In other words,
time is curved but space remains Euclidean. As the Schwarzschild solution shows, this is wrong.

In his search for the gravitational field equations Einstein was led by a powerful formal
analogy with electrodynamics (whose four-dimensional formulation due to Minkowski he had
initially been slow to endorse), whose (“specially” covariant) field equations take the form

$$\partial_\mu F^{\mu\nu} = k J^\nu, \tag{1.7}$$


Entwicklung der Physik ist damit erreicht. Nicht bloß eine Ausschaltung der Sinne, wie bei der Mechanistik, findet 
hier statt, sondern eine gänzliche Beseitigung des Anthropomorphismus. Die Begriffsbildungen haben sich ganz und 
gar von dem anschaulich Geläufigen emanzipiert; und gerade dadurch, daß man sich von der Anschauung losmacht,
wird die Objektivität in der Auffassung der Naturgesetze sowie die Einheit und Übersichtlichkeit des theoretischen
Systems erreicht. In dieser Hinsicht möchte ich das allgemeine Relativitätsprinzip als den höchsten Triumph des
Geistes über die Erscheinungswelt ansehen.’ (Hilbert, 1992, p. 51).

53See Janssen (2014). Ironically, Einstein started his Herbert Spencer Lecture with the following warning: ‘If 
you wish to learn from the theoretical physicist anything about the methods which he uses, I would give you the 
following piece of advice: Don’t listen to his words, examine his achievements.’ See also van Dongen (2010). 

54The reconstruction of Einstein’s ideas during 1912-1913 is largely based on the Zürich Notebook, which 
Einstein used from August 1912 to May 1913. A transcription may be found in Einstein (1996a) and a facsimile
with transcription and commentary is in Renn (2007), Volume 1, pp. 313-487. The original is kept in Jerusalem. 

55 ‘geradezu übermenschlichen Anstrengungen, mit denen ich mich dem Gravitationsproblem gewidmet habe’,
as he wrote on May 28, 1913, to his friend Paul Ehrenfest, quote taken from Fölsing (1993, p. 357). 
where $F$ is the electromagnetic field strength tensor, $J = (J^0, \vec{J})$ is the electric (charge density,
current), and $k$ is a constant. This suggested to Einstein that (1.5) should also be thought of as

$$\partial_\rho H^{\rho}_{\mu\nu} = \kappa (T_{\mu\nu} + t_{\mu\nu}), \tag{1.8}$$

where the object $H^{\rho}_{\mu\nu}$, constructed from the metric, represents the gravitational field, and $t_{\mu\nu}$ is the
energy-momentum tensor of the gravitational field itself (whereas $T_{\mu\nu}$ is the energy-momentum
tensor of the matter in the universe). From the point of view of (1.5), an obvious first guess for
the left-hand side, which Einstein indeed wrote down, is (up to a constant factor)

$$Q_{\mu\nu} = g^{\rho\sigma} \frac{\partial^2 g_{\mu\nu}}{\partial x^\rho\, \partial x^\sigma}, \tag{1.9}$$

where $(g^{\rho\sigma})$ is the inverse matrix to $(g_{\rho\sigma})$ as usual, but in the context of (1.8) he started from

$$H^{\rho}_{\mu\nu} = -\tfrac{1}{2} g^{\rho\sigma} \partial_\mu g_{\sigma\nu}, \tag{1.10}$$


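which he later described as a ‘fateful prejudice’. It was only in November 1915 that he realised
that the choice

$$H^{\rho}_{\mu\nu} = -\Gamma^{\rho}_{\mu\nu}, \tag{1.11}$$

where $\Gamma^{\rho}_{\mu\nu}$ are the Christoffel symbols he knew well from the absolute differential calculus, i.e.,

$$\Gamma^{\rho}_{\mu\nu} = \tfrac{1}{2} g^{\rho\sigma} (\partial_\nu g_{\sigma\mu} + \partial_\mu g_{\sigma\nu} - \partial_\sigma g_{\mu\nu}), \tag{1.12}$$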


gave him the best of both worlds (though not yet quite the correct field equations, see below). 
One reason for (1.10) may have been that if he adapted the Lagrangian of electrodynamics, i.e. 


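$$L = -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu} = -\tfrac{1}{4} \eta^{\mu\rho} \eta^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma}, \tag{1.13}$$

to the gravitational case by postulating

$$L = -g^{\mu\nu} H^{\rho}_{\mu\sigma} H^{\sigma}_{\nu\rho}, \tag{1.14}$$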


then the choice (1.10) led to what historians call the Entwurf Theorie (Einstein & Grossmann, 
1913). Although the field equations of this theory (whose tedious explicit form we omit), 
derived from (1.14) by the variational calculus, are not generally covariant (like the Lagrangian 
itself), Einstein nonetheless felt they were correct, since they gave both the Newtonian limit and 
energy-momentum conservation, albeit only in certain preferred coordinate systems. He wrote: 


The labor is finally ready, after endless trouble and vexing doubts.⁵⁶


This put Einstein in a very interesting psychological situation: since he believed his
theory was ready despite the egregious shortcoming of not being generally covariant (and hence,
as he thought, not satisfying the “general principle of relativity”, and hence, via his virtual
identification of all these things, violating the equivalence principle), he started looking for
arguments against general covariance (and, by implication, almost all his other holy principles)!⁵⁷


56 ‘Die Arbeit ist nach unendlicher Mühe und quälenden Zweifeln nun endlich fertig geworden.’ Quote from an
undated letter to Ernst Mach, probably written during the summer of 1913, taken from Fölsing (1993). 
57See Janssen (2014) and Norton (2018) for fascinating reflections on this remarkable aspect of Einstein’s mind. 
1.5 The Hole Argument


Whatever its origin, one such argument has been of lasting value, namely the Hole Argument
(Lochbetrachtung).⁵⁸ Before turning to Einstein’s rendition of this argument, it may be
helpful, though anachronistic, to first give Hilbert’s (1917) reformulation of Einstein’s Hole
Argument, which was also the first attempt to discuss the Cauchy problem for the Einstein
equations (see §1.9). Adding insult to historical injury, we interpret general covariance of GR as
diffeomorphism invariance, in the sense that if a metric $g$ solves the vacuum Einstein equations

$$R_{\mu\nu} = 0, \quad \text{i.e.} \quad \mathrm{Ric}(g) = 0, \tag{1.15}$$

then also $\psi^* g$ solves these equations, for any diffeomorphism $\psi$. This is because

$$\mathrm{Ric}(\psi^* g) = \psi^* \mathrm{Ric}(g), \tag{1.16}$$

cf. §2.5.4. That is, the Ricci tensor for the transformed metric $\psi^* g$ is the transform (pullback) of
the Ricci tensor for the original metric $g$ (this property is also the root of the Bianchi identities). If
we now take initial data on some three-dimensional spacelike hypersurface $\Sigma$ (it will be explained
in §7.6 what this exactly means), and find a diffeomorphism $\psi$ that is equal to the identity in
some neighbourhood of $\Sigma$ but is nontrivial elsewhere, and $g$ solves the Einstein equations with
the given initial data, then so does $\psi^* g$, for the same initial data! Hence a generally covariant
theory cannot be deterministic in the sense of being a system of partial differential equations
with a well-posed initial value (Cauchy) problem with a unique solution (at least for short times).

The name “Hole Argument” comes from Einstein’s original version. Turning the above story
inside out, he takes the region where $\psi$ (which in his case is a coordinate transformation) is
nontrivial to be a (four-dimensional) “hole” in space-time, and notes that boundary conditions
outside the hole do not determine the metric within it.⁵⁹ In coordinates the argument reads as
follows. If $g_{\mu\nu}(x)$ solves some generally covariant equations in a coordinate system $(x)$, then so
does $g'_{\mu\nu}(x')$, i.e. the same metric expressed in a new coordinate system $(x')$, constructed from
$g_{\mu\nu}(x)$ by the usual transformation rules for tensors. This is the same metric. Einstein’s point is
that $g'_{\mu\nu}(x)$ also solves the equations, even though (barring isometries) it is a different metric.
Therefore, Einstein (correctly) concluded, the gravitational field is not uniquely fixed.
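
In formulas, and only as a sketch of the coordinate version just described (the Jacobian notation is an assumption made here, not Einstein’s own), the two objects being compared are:

```latex
% Same metric in new coordinates x' = x'(x): the usual tensor transformation rule
g'_{\mu\nu}(x') = \frac{\partial x^\rho}{\partial x'^\mu}\, \frac{\partial x^\sigma}{\partial x'^\nu}\; g_{\rho\sigma}(x) .
% Different metric: the same functions g'_{\mu\nu}, now evaluated at the old coordinate values x
x \mapsto g'_{\mu\nu}(x) .
% If x' = x outside the hole, then g'_{\mu\nu}(x) = g_{\mu\nu}(x) there, while inside the hole
% g'_{\mu\nu}(x) and g_{\mu\nu}(x) generally differ, although both solve the same covariant equations.
```

The second line is where the argument bites: nothing outside the hole distinguishes the two fields, yet they disagree inside it.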

Strangely enough, during 1915, when he returned to general covariance, Einstein conveniently
forgot to mention his Hole Argument, returning to it only in his review Einstein (1916a), preceded
by discussions in private correspondence from December 1915 onward. Based on the so-called
point-coincidence argument, which claims that ‘nothing is physically real but the totality of
space-time point coincidences’,⁶⁰ he argued that $(M, g)$ and $(M, \psi^* g)$ represent the same physical
situation, i.e., in modern terminology, that diffeomorphisms are gauge symmetries.⁶¹ For the
moment we leave it at that, and return to the development of GR during 1913-1915.


58See Stachel (2014), Norton (2019), and Pooley (2020) for surveys, and e.g. Weatherall (2018) for further 
analysis. Another argument that at the time convinced Einstein that the Entwurf Theorie was correct (and hence 
general covariance was untenable) was that energy-momentum conservation was only possible in specific coordinate 
systems, namely precisely in those where the Entwurf field equations were supposed to be valid. 

59This looks unnatural compared to Hilbert’s formulation, but as Stachel (2014) remarks, Einstein was probably
once again inspired by Mach’s principle, where “the fixed stars at infinity” determine the local inertia of matter.

60Quoted in Stachel (2014) and elsewhere from a letter to Besso, 3 January 1916, translation by Stachel.

61 Though this answers the Hole Argument, it is not actually true in asymptotically flat space-times, where 
diffeomorphisms induce physically nontrivial transformations at infinity. Furthermore, interpreting diffeomorphisms 
as gauge symmetries leads to the problem of time, to which we will return later in this book, cf. §8.11.




1.6 Finding the gravitational field equations: November 1915 


During these years, Einstein became increasingly dissatisfied with his Entwurf Theorie:⁶²


I recognized that the field equations for gravitation I had so far were totally untenable.⁶³


It is remarkable how quickly Einstein then collected himself, since in the dramatic month of 
November 1915 he wrote four brief papers converging to the final answer (1.6), although, as if 
they were a compressed history of the preceding years, some of these contained new mistakes. 
One reason for Einstein’s hurry in putting his thoughts into print was a competition with Hilbert 
(or at least that is what Einstein felt); we will return to Hilbert’s role in the history of GR shortly. 

The first paper (Einstein, 1915a, dated November 4) still failed to achieve general covariance, 
but at least Einstein states the intention to restore it (in a remarkably personal passage): 


On these grounds I completely lost confidence in the field equations I had established and 
searched for a way to restrict the possibilities in a natural manner. Thus I got back to the 
requirement of more generally covariant field equations, which I had left only with a heavy 
heart when I worked together with my friend Grossmann. In fact we had already then come 
very close to the solution of the problem given in what follows.⁶⁵ (Einstein, 1915a, p. 778)


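Einstein recognized that (1.11) rather than (1.10), which had led to the Entwurf equations, was
the correct choice. Putting (1.11) in the Lagrangian (1.14) then leads to the field equations

$$\mathcal{R}_{\mu\nu} = \kappa T_{\mu\nu}, \tag{1.17}$$

where the non-covariant expression $\mathcal{R}_{\mu\nu} = \partial_\rho \Gamma^{\rho}_{\mu\nu} - \Gamma^{\rho}_{\mu\sigma} \Gamma^{\sigma}_{\nu\rho}$ is “half” of the full Ricci tensor

$$R_{\mu\nu} = \partial_\rho \Gamma^{\rho}_{\mu\nu} - \partial_\nu \Gamma^{\rho}_{\rho\mu} + \Gamma^{\rho}_{\rho\sigma} \Gamma^{\sigma}_{\mu\nu} - \Gamma^{\rho}_{\mu\sigma} \Gamma^{\sigma}_{\nu\rho}. \tag{1.18}$$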


62Einstein’s rejection of the Entwurf Theorie is a story by itself, but briefly: (i) it did not satisfy Mach’s principle
as Einstein saw it (he insisted that a uniformly rotating empty Minkowski space-time should be a solution to the
Entwurf equations, which it wasn’t, a calculation Einstein apparently did over and over again with different results
each time); (ii) there were problems with its Lagrangian formulation; (iii) Einstein’s earlier arguments that the 
theory was unique given the correct Newtonian limit and energy-momentum conservation turned out to be flawed; 
and (iv) it got the perihelion shift of Mercury wrong (off by a factor 2.4), though Einstein’s somewhat cynical 
reactions to such discrepancies between theory and experiment are too well known to be repeated here. 

63 ‘Ich erkannte nämlich, dass meine bisherigen Feldgleichungen der Gravitation gänzlich haltlos waren!’ Letter
to Sommerfeld, 28 November 1915 (Einstein, 1999, Doc. 153). This insight refers to October 1915. 

64Each of these papers was based on a talk Einstein gave at the Prussian Academy of Sciences on the day the 
paper is dated (in particular, he also presented his final field equations on November 25th). See Simon (2005). For 
those who wish to look at the original papers it is worth mentioning that Einstein denotes the Ricci tensor by $G_{im}$
instead of the current $R_{\mu\nu}$ (and today’s $G_{\mu\nu}$ is the Einstein tensor $R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R$), whilst his $R_{im}$ is minus our $\mathcal{R}_{\mu\nu}$.

65‘Aus diesen Gründen verlor ich das Vertrauen zu den von mir aufgestellten Feldgleichungen vollständig und 
suchte nach einem Wege, der die Möglichkeiten in einer natürlichen Weise einschränkte. So gelangte ich zu der 
Forderung einer allgemeineren Kovarianz der Feldgleichungen zurück, von der ich vor drei Jahren, als ich zusammen 
mit meinem Freunde Grossmann arbeitete, nur mit schwerem Herzen abgegangen war. In der Tat waren wir damals 
der im nachfolgenden gegebenen Lösung des Problems bereits ganz nahe gekommen.’

66Compared to the Einstein-Hilbert Lagrangian $\mathcal{L}_{EH} = \sqrt{-g}\, R$, the Lagrangian $\mathcal{L}_{\mathrm{Nov4}} = -g^{\mu\nu} \Gamma^{\rho}_{\mu\sigma} \Gamma^{\sigma}_{\nu\rho}$ used in
Einstein (1915a), cf. (1.11) and (1.14), assuming $g = -1$, is not so far off. The first two terms in (1.18), which are
absent in $\mathcal{L}_{\mathrm{Nov4}}$, merely bring a divergence and hence do not contribute to the equations of motion; this is the reason
why the Einstein equations are second-order, although $\mathcal{L}_{EH}$ contains second-order derivatives of the metric and
hence a priori one would expect fourth-order equations. Furthermore, the third term in (1.18) vanishes if $g = -1$,
cf. (1.19), so that all that survives of $\mathcal{L}_{EH}$ is precisely $\mathcal{L}_{\mathrm{Nov4}}$. The reason the equations (1.17) miss $-\tfrac{1}{2} g_{\mu\nu} R$ is that
this term arises from a variation of $\sqrt{-g}$ in $\mathcal{L}_{EH}$, which is missing in $\mathcal{L}_{\mathrm{Nov4}}$ because it has been put equal to 1.
In unimodular coordinates, in which $g = \det(g) = -1$, we actually have $\mathcal{R}_{\mu\nu} = R_{\mu\nu}$, since

$$\partial_\mu \sqrt{-g} = \sqrt{-g}\, \Gamma^{\rho}_{\rho\mu}. \tag{1.19}$$
However, it seems that Einstein recognized this fact only after he had submitted his first November 
paper. For one had to wait for the second one (Einstein, 1915b) for him to say the following: 


This tensor $R_{\mu\nu}$ is the only tensor that is available for the formulation of generally covariant
gravitational equations. If we now agree that the field equations of gravitation should be

$$R_{\mu\nu} = \kappa T_{\mu\nu}, \tag{1.20}$$

then we have gained generally covariant field equations.⁶⁷
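
The identity (1.19), and with it the agreement of the non-covariant expression with the full Ricci tensor in unimodular coordinates, can be checked in a few lines. The following is a sketch under the standard conventions for the Levi-Civita connection; Jacobi’s formula for the derivative of a determinant is the only ingredient not spelled out in the text:

```latex
% Jacobi's formula for g = \det(g_{\mu\nu}):
\partial_\mu g = g\, g^{\rho\sigma}\, \partial_\mu g_{\rho\sigma} .
% Contracting the Christoffel symbols (1.12) over \rho and \nu; two terms cancel by symmetry of g^{\rho\sigma}:
\Gamma^{\rho}_{\rho\mu} = \tfrac{1}{2}\, g^{\rho\sigma}\, \partial_\mu g_{\rho\sigma} = \frac{\partial_\mu g}{2g} .
% Hence, using g < 0:
\partial_\mu \sqrt{-g} = -\frac{\partial_\mu g}{2\sqrt{-g}} = \sqrt{-g}\; \frac{\partial_\mu g}{2g} = \sqrt{-g}\; \Gamma^{\rho}_{\rho\mu} .
% If g = -1 everywhere, then \Gamma^{\rho}_{\rho\mu} = 0, so every term of the Ricci tensor containing this
% contraction drops out and only \partial_\rho \Gamma^{\rho}_{\mu\nu} - \Gamma^{\rho}_{\mu\sigma} \Gamma^{\sigma}_{\nu\rho} remains.
```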


He then justifies (1.20) by the fact that in unimodular coordinates it coincides with his earlier 
(1.17), but notes a serious problem: combining (1.17) with the unimodularity condition 


det(g) = —1 (1.21) 


enforces T = 0. At this point, under less duress he would undoubtedly have seen that the 
problem is solved by adding $-\tfrac{1}{2} g_{\mu\nu} R$ to the left-hand side (or, equivalently, $-\tfrac{1}{2} g_{\mu\nu} T$ to the
right-hand side) of (1.20). But this simple solution took him another week to arrive at.⁶⁸ Instead,
he apparently felt compelled to save both his equations (1.17), which were the ones he really 
believed in, and general covariance. This combination required the unimodularity condition, 
and hence tracelessness of the energy-momentum tensor, which therefore had to be justified 
one way or the other. Such justification was available in the form of the electromagnetic world 
hypothesis, which went back to Gustav Mie (1868-1957), and also haunted Hilbert (as we shall 
see). In Einstein’s case it was rather short-lived, since his only reason for believing in it was to 
obtain a traceless energy-momentum tensor. During the next week he saw his reasoning collapse 
once again, for he noted that the unimodularity condition was incompatible with what he still 
thought was the Newtonian limit of his theory, namely (1.3). But in Einstein (1915c) he redid 
the computation of the perihelion shift of Mercury, which he had first done with Besso in June 
1913 using his Entwurf Theorie,⁶⁹ from his new equations (1.17), and assuming (1.21):


Imagine my joy when I found that the equations correctly have the perihelion shift of 
Mercury (...) I was speechless from excitement for several days.⁷⁰


The computation also opened Einstein’s eyes to the incorrectness of (1.3) in the Newtonian limit
and at last gave him the correct picture of it. Having rescued the condition (1.21), all that was left
was to remove its undesired consequence $T = T^{\mu}_{\mu} = 0$.⁷¹ Thus Einstein (1915d) finally wrote

$$R_{\mu\nu} = \kappa \left( T_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} T \right). \tag{1.22}$$


This was the end of his magnificent search for generally covariant gravitational field equations. 
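
That adding the trace term on the left of (1.20) is equivalent to subtracting one on the right can be verified by taking a trace. This is a short sketch in four space-time dimensions, where $g^{\mu\nu} g_{\mu\nu} = 4$, with $R = g^{\mu\nu} R_{\mu\nu}$ and $T = g^{\mu\nu} T_{\mu\nu}$:

```latex
% Start from (1.5) with (1.6):
R_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} R = \kappa\, T_{\mu\nu} .
% Contract with g^{\mu\nu}:
R - 2R = \kappa T \quad\Longrightarrow\quad R = -\kappa T .
% Substitute back:
R_{\mu\nu} = \kappa\, T_{\mu\nu} + \tfrac{1}{2}\, g_{\mu\nu} R = \kappa \left( T_{\mu\nu} - \tfrac{1}{2}\, g_{\mu\nu} T \right) .
% In particular, whenever T = 0 both forms reduce to R_{\mu\nu} = \kappa T_{\mu\nu}.
```

The last line shows why the uncorrected equations could survive for a while under the assumption of a traceless energy-momentum tensor.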


67 ‘Dieser Tensor $G_{im}$ ist der einzige Tensor, der für die Aufstellung allgemein kovarianter Gravitationsgleichungen
zur Verfügung steht. Setzen wir nun fest, daß die Feldgleichungen der Gravitation lauten sollen $G_{\mu\nu} = -\kappa T_{\mu\nu}$, so
haben wir damit allgemein kovariante Feldgleichungen gewonnen.’ (Einstein, 1915b, p. 800). The Greek indices are
in fact Einstein’s own notation; he freely mixed these up with Latin ones.

68Footnote 1 in the November 18th paper (Einstein, 1915c) shows that Einstein knew the solution by then. 

69See Einstein (1996a), Doc. 14, with extensive editorial notes on pp. 344-359.

70‘Denk Dir meine Freude beim Resultat, daß die Gleichungen die Perihel-Bewegungen Merkurs richtig liefern 
(...), Ich war einige Tage fassungslos vor Erregung.’ From a letter to Ehrenfest, January 16, 1916, cf. Fölsing (1993, 
p. 418). Fokker (1955) also reports that Einstein had told him he had got palpitations after this computation. 

71Like (1.20), the linearized form of (1.22) was already in the Zürich Notebook; the extra term $\tfrac{1}{2} g_{\mu\nu} T$ also
balances a corresponding term in the gravitational energy-momentum tensor (Janssen & Renn, 2020).
Albert Einstein in 1916 (Credit: Museum Boerhaave, Leiden)

1.7 Hilbert


Though this is not obvious from Einstein’s papers, his mathematical colleague David Hilbert
(1862-1943) played a significant role in the development of GR.⁷² Contra his shallow and
completely undeserved reputation of being a “formalist”,⁷³ Hilbert was actually interested in
physics throughout his career and often lectured on it. He combined this interest with his
relentless emphasis on axiomatization, which started with his famous memoir Grundlagen der
Geometrie from 1899, in which he rewrote Euclidean geometry and heralded the modern era in
mathematics. Exemplifying this, his Sixth Problem (from the famous list of 23 in 1900) reads:


Mathematical Treatment of the Axioms of Physics. The investigations on the foundations 
of geometry suggest the problem: To treat in the same manner, by means of axioms, those 
physical sciences in which already today mathematics plays an important part; in the first 
rank are the theory of probabilities and mechanics. (Hilbert, 1902a). 


Hilbert’s involvement with Einstein and relativity goes back to his joint seminar during the Winter 
Semester of 1907 at Göttingen with his friend and colleague Hermann Minkowski (1864-1909), 
who incidentally had been one of Einstein’s teachers at Zürich (and did not think much of him). 
This seminar led Minkowski to his four-dimensional space-time view of special relativity, which
after some hesitation Einstein also adopted and which of course was one of the keys to GR.

Hilbert was also interested in Mie’s electromagnetic theory of matter from 1912, which has
already been mentioned in connection with Einstein (1915b), and which was perhaps the first
example of a “unified field theory”. It was, in particular, based on an action principle (i.e. a
Lagrangian, called a “world function” at the time), an idea which fitted well with Hilbert’s notion
of axiomatization and would play a central role in his work on gravitation to come.⁷⁵

In 1915 Einstein came to Göttingen to give the Wolfskehl Lectures (from June 19 to July 
7), which were devoted to general relativity and especially his Entwurf Theorie, which he still 
believed in at the time. Hilbert not only attended these lectures (as did e.g. Emmy Noether 
and Felix Klein), but he and his wife also hosted Einstein as their personal guest at home.⁷⁶ It
seems that Einstein’s visit triggered an all-out assault on the foundations of physics by Hilbert,
who tried to combine elements of Mie’s theory with Minkowski’s space-time view of special
relativity and Einstein’s insights into the applicability of Riemannian geometry and the absolute
differential calculus (all of which Hilbert was very familiar with) to the theory of gravitation.


72 The sources for this subsection are Sauer (1999), Corry (2004) and Renn (2007), Vol. 4. The only biography of 
Hilbert is Reid (1970); a scientific biography is lacking. Rowe (2018) is a portrait of Hilbert’s circle in Göttingen. 
73 Although he did not invent it, Hilbert was a pioneer of the view that rigorous mathematical proofs should be 
purely syntactic and hence independent of the meaning of the symbols in them, as long as the rules for manipulating 
these have been stated. This came to a head in the last part of his career (1920-1930), which was devoted to Proof 
Theory (a field of mathematics he did invent). But this kind of formalization was restricted to the analysis of proofs 
and axiom systems; until the 1920s Hilbert even stated axioms informally, combining mathematical and natural 
language. Outside this specific context, mathematics was as much alive for Hilbert as it is for anybody. A decade 
after he had played his role in GR, Hilbert also initiated the (serious) mathematical study of quantum mechanics, 
culminating in von Neumann’s formalism based on Hilbert spaces; see e.g. Landsman (2022) and references therein. 
74See e.g. Wightman (1976), Gorban (2018), and Corry (2018) for essays on Hilbert’s Sixth Problem.
75Hilbert’s interest in variational principles went back to his work on the Dirichlet principle (Hilbert, 1904).
76Corry (2004, p. 325) notes that Einstein and Hilbert had similar unconventional political views, notably their
belief in the fundamentally international spirit of science. Neither had signed the patriotic and vitriolic manifesto 
Aufruf an die Kulturwelt from October 1914, in which 93 leading German intellectuals wholeheartedly supported the 
German side in the First World War. Among the signatories we find physicists like Fritz Haber and Max Planck, and 
mathematicians like Felix Klein; it was even more courageous of Hilbert not to sign it than it was of Einstein, since 
the former was a German citizen whereas the latter was, at the time, Swiss (though originally German by birth). 
This led to two papers: Hilbert (1915, 1917), of which especially the first is of historical interest.
Precisely because of the mix of the now outdated and partly incomprehensible Mie theory with 
Einstein’s ideas (many of which have survived), Hilbert’s reasoning is hard to follow. Even 
Einstein (who was familiar with Mie’s theory and presumably also with his own) wrote: 


Why do you make it so hard for poor mortals by withholding the technique behind your 
ideas? It surely does not suffice for the thoughtful reader if, although able to verify the 
correctness of your equations, he cannot get a clear view of the overall plan of the analysis.⁷⁷
(Einstein to Hilbert, 30 May 1916) 


Furthermore, there are serious discrepancies between the first galley proofs of Hilbert (1915) and 
the published version. These proofs presumably contain the paper in the form Hilbert presented 
it on 20 November 1915 in a formal colloquium to the Göttingen Academy of Sciences, i.e. five 
days before Einstein submitted his final paper (1915d) containing the correct field equations 
(this had even been preceded by an informal talk by Hilbert on 16 November with undoubtedly 
a similar content). On top of this, in a bizarre twist of events these galley proofs suffer from a 
deletion of precisely the part that qua location might have contained the correct field equations, 
cut out by an unknown person at an unknown time. Since Einstein and Hilbert were in regular 
contact during November 1915, this has led to wild speculations to the effect that Einstein had 
taken (or even stolen) his equations from Hilbert, or even if he hadn’t, that Hilbert had at least 
scooped him and should get the credit for the invention of GR (seen as the “Einstein” equations). 
These speculations even stretched to the claim that Einstein fans had allegedly cut out the 
missing part because it contained the “Hilbert” equations, though by the same token Hilbert 
fans could have taken it out to hide the fact that it did not contain the Einstein equations.78 

A detailed reconstruction shows that the parts missing from the galley proofs probably did 
not contain the correct field equations, or indeed any field equations, though they may well have 
contained the explicit Lagrangian $\sqrt{-g}\,R$, which may therefore correctly be called the Hilbert 
Lagrangian. Since Einstein (1916b) independently found this Lagrangian, but published it later 
than Hilbert (1915) even as published in 1916, the name Einstein–Hilbert Lagrangian is also 
correct. Having said this, knowing the correct Lagrangian, Hilbert could easily have derived the 
Einstein equations, for both the galley proofs and the published version of Hilbert (1915) show 
that he was perfectly familiar with the necessary variational techniques, and indeed all steps in 
the computation are indicated, except that the Lagrangian is left unspecified. 

It seems that until 1916 Einstein was hardly influenced by the work of Hilbert (though he 
clearly admired him), except perhaps: (i) for his brief flirtation with the electromagnetic world view 
in Einstein (1915b), which he discarded as soon as he could, and (ii) by Hilbert’s competition 
speeding up his work in November 1915: since Hilbert sent him occasional updates on his work 
and invited him to at least his first talk in Göttingen on November 16, Einstein must have felt 
Hilbert breathing down his neck. On the other hand, the opposite influence is very clear from e.g. 
the differences between the galley proofs and the actual publication of Hilbert (1915). 


77 ‘Warum machen Sie es dem armen Sterblichen so schwer, indem Sie ihm die Technik Ihres Denkens vorenthalten? 
Es genügt doch dem denkenden Leser nicht, wenn er zwar die Richtigkeit Ihrer Gleichungen verifizieren aber 
den Plan der ganzen Untersuchung nicht überschauen kann.’ Quoted with translation by Stachel & Renn (2007), p. 
881. This paper gives a detailed reconstruction and interpretation of Hilbert’s work on GR. 

78 Before these galley proofs were discovered (by Corry in 2004), it was often suggested on fair grounds that 
Hilbert had priority over Einstein, since the published version of Hilbert (1915), which does contain the correct 
equations, carries a submission date of 20 November 1915. The tables initially turned when it was found that the 
galley proofs did not contain the Einstein equations, upon which the deleted section again complicated the issue. 
See Sauer (1999, 2005) and Rowe (2006) for a settlement and a review of the issue, respectively, which we follow. 




These hectic events during November 1915 led to some tension between Einstein and Hilbert: 


The theory is beautiful beyond comparison. However, only one colleague has really under- 
stood it [i.e. Hilbert], and he is seeking to “partake” it (Abraham’s expression) in a clever 
way. In my personal experience I have hardly come to know the wretchedness of mankind 
better than as a result of this theory and everything connected to it.79 (Einstein to Zangger, 
26 November 1915) 


Hilbert did his best to alleviate the situation, for example by changing ‘my theory’ to ‘the theory’ 
in his galley proofs for Hilbert (1915), and by adding that the ten gravitational potentials $g_{\mu\nu}$ 
were ‘first introduced by Einstein’. This helped, for a month later Einstein directly wrote him: 


There has been a certain ill-feeling between us, the cause of which I do not want to analyze. 
I have struggled against the feelings of bitterness attached to it, and this with complete 
success. I think of you again with unmarred friendliness and ask you to try to do the same 
with me. Objectively it is a shame when two real fellows who have extricated themselves 
somewhat from this shabby world do not afford each other mutual pleasure. 

With best regards, A. Einstein.80 (Einstein to Hilbert, 20 December 1915) 


What remains of lasting value, apart from his identification of the correct Lagrangian for GR, is 
Hilbert’s recognition that the energy-momentum tensor $T_{\mu\nu}$ (which Einstein had to specify as 
such, even in cases where he relied on an action principle for the gravitational sector) is simply 
the variational derivative of the matter Lagrangian; although Hilbert only discovered this for 
electromagnetism, once the point had been made its generalization to other forms of matter was 
obvious. Of course, this made possible the derivation of the complete gravitational equations 
(including matter) from a single action principle, which Hilbert had been after all along.81 
Furthermore, Hilbert was the first to use the (contracted) Bianchi identities in GR, deriving 
them (as we shall also do) from the invariance of the Ricci scalar under coordinate transformations 
(or diffeomorphisms), and drew the (highly nontrivial) conclusion that in electrodynamics the 
vacuum Maxwell equations $\nabla_\mu F^{\mu\nu} = 0$ follow from the coupling to gravity plus these Bianchi 
identities. See also §1.9. Finally, as many quotes (like the one opening this Introduction and 
the one ending §1.3) show, Hilbert quickly became a champion of GR, including Einstein’s 
authorship of it (sometimes even at the expense of mentioning his own contributions). Coming 
from the leading mathematician in the world at a time in which Einstein was by no means yet the 
stellar figure he would later become, this undoubtedly helped the theory (and its creator). 
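
In modern notation, the single action principle Hilbert was after can be sketched as follows. This is only an illustration: the signature $(-+++)$, the coupling constant $\kappa$, the matter Lagrangian $L_M$ with matter fields $\psi$, and the sign and factor conventions in the definition of $T_{\mu\nu}$ are assumptions made here (they vary between texts) and are not taken from Hilbert's papers.

```latex
\begin{align*}
% Total action: gravitational (Hilbert) part plus matter part
S[g,\psi] &= \int d^4x\, \sqrt{-g}\, \Big( \frac{1}{2\kappa}\, R + L_M(g,\psi) \Big); \\
% Energy-momentum tensor as the variational derivative of the matter action
T_{\mu\nu} &:= -\frac{2}{\sqrt{-g}}\, \frac{\delta (\sqrt{-g}\, L_M)}{\delta g^{\mu\nu}}; \\
% Variation of the gravitational part, up to a total divergence
\delta (\sqrt{-g}\, R) &= \sqrt{-g}\, \big( R_{\mu\nu} - \tfrac12 g_{\mu\nu} R \big)\, \delta g^{\mu\nu} + \text{(total divergence)}; \\
% Stationarity of S under variations of the metric gives the field equations with matter
\delta S = 0 \;&\Longrightarrow\; G_{\mu\nu} := R_{\mu\nu} - \tfrac12 g_{\mu\nu} R = \kappa\, T_{\mu\nu}; \\
% Contracted Bianchi identities and their consequence for the matter side
\nabla^\mu G_{\mu\nu} = 0 \;&\Longrightarrow\; \nabla^\mu T_{\mu\nu} = 0.
\end{align*}
```

Read in this way, the contracted Bianchi identities force the matter side to be divergence-free once the gravitational equations hold, which is the kind of constraint underlying Hilbert's conclusion about the Maxwell equations.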


79 ‘Die Theorie ist von unvergleichlicher Schönheit. Aber nur ein Kollege hat sie wirklich verstanden und sucht sie 
auf geschickte Weise zu “nostrifizieren” (Abraham’scher Ausdruck). Ich habe in meinen persönlichen Erfahrungen 
kaum je die Jämmerlichkeit der Menschen besser kennen gelernt wie gelegentlich dieser Theorie und was damit 
zusammenhängt.’ Quoted with translation by Stachel & Renn (2007), p. 911. Heinrich Zangger (1874-1957) 
had been a friend of Einstein’s since 1906. See Corry (2004, §9.2) on the culture of “nostrification” in Hilbert’s 
Göttingen: ‘It was widely understood, among German mathematicians at least, that “nostrification” encapsulated 
the peculiar style of creating and developing scientific ideas in Göttingen, and not least because of the pervasive 
influence of Hilbert. Of course, “nostrification” should not be understood as mere plagiarism.’ (p. 419). 

80 ‘Es ist zwischen uns eine gewisse Verstimmung gewesen, deren Ursache ich nicht analysieren will. Gegen das 
damit verbundene Gefühl der Bitterkeit habe ich gekämpft, und zwar mit vollständigem Erfolge. Ich gedenke Ihrer 
wieder in ungetrübter Freundlichkeit, und bitte Sie, dasselbe bei mir zu versuchen. Es ist objektiv schade, wenn 
zwei wirkliche Kerle, die sich aus dieser schäbigen Welt etwas herausgearbeitet haben, nicht gegenseitig zur Freude 
gereichen. Es grüsst Sie bestens, Ihr A. Einstein’ (again taken from Stachel & Renn 2007, p. 913). 

81 In this context Lorentz (1916) should be mentioned, in which Hendrik Antoon Lorentz (1853-1928) develops 
a coordinate-free version of GR, based on a geometric interpretation of the Ricci scalar in the Lagrangian (Kox, 
1988; Janssen, 1992). Unfortunately, since he did so just before the absolute differential calculus was geometrized 
by Levi-Civita’s (1917a) invention of parallel transport (Iurato, 2016), his work on GR had very little influence. 
1.8 Weyl 


The bridge between Einstein’s work and the modern mathematical approach to GR found in this 
book (and many others, including the great texts by Hawking & Ellis and by Misner, Thorne, 
& Wheeler, both of which appeared in 1973) is not so much Hilbert, whose mathematical 
style in GR is surprisingly old-fashioned and in fact hardly different from Einstein’s, but his 
former PhD student Hermann Weyl (1885-1955). Weyl was an extraordinarily broad and versatile 
mathematician, almost comparable with Hilbert himself, whose interest in physics as well as in 
the foundations of mathematics he also shared.82 Weyl spent the years 1904-1913 in Göttingen,83 
where, clearly under the spell of Hilbert,84 his early work was in functional analysis, then an 
upcoming field which was dominated at least in Germany by Hilbert’s work on integral equations. 
Weyl’s PhD thesis from 1908 was on singular integral equations and Fourier theory, after which 
his Habilitation thesis from 1910 was on Sturm-Liouville problems, seen in the context of what 
we now (since the work of von Neumann) call unbounded operators on Hilbert space.85 

Weyl ended his Göttingen period with his famous book Die Idee der Riemannschen Fläche 
(1913), which launched the global study of Riemann surfaces and is one of the stepping stones 
towards the modern definition of a manifold (see footnote 40). He then moved to Zürich (as 
the successor of Geiser, the man who had introduced Einstein to differential geometry), where he 
met Einstein and evidently got interested in relativity. In 1918 Weyl published his lecture notes 
Raum - Zeit - Materie (Space - Time - Matter), from which we already quoted the preface at the 
beginning of this historical overview. Einstein himself wrote a glowing review: 


I am always tempted to read the individual parts of this book again, because every page shows 
the amazingly steady hand of the master who has penetrated the subject matter from the 
most diverse angles. I consider it a happy occasion that such a distinguished mathematician 
has taken care of this new field. He understood how to combine mathematical rigor with 
graphic intuition. From this book, the physicist can learn the foundations of geometry and 
the theory of invariants, and the mathematician can learn those of electricity and the theory 
of gravitation. (...) One especially sees there with amazement how the most complicated 
becomes simple and self-evident under Weyl’s hand. (...) It is here that Weyl not only 
demonstrates his easy mastery of the mathematical form, but also his deep insight into what 
is essential in physics. (...) The expositions of the last paragraphs exemplify how a born 
mathematician can be effective here through simplifying and clarifying. The book will be 
invaluably helpful to everybody who wants to work in this field, not to mention the pure joy 
derived from its study.86 (Einstein, 2002, pp. 62-63) 


82 The latter interest even led to a break between them at the time when Weyl supported Brouwer’s intuitionism. 

83 See e.g. Eckes (2019). There seems to be no biography of Weyl, but see Scholz (2001) for his mathematics and 
especially Raum - Zeit - Materie, and Ryckman (2005) for his philosophy (mostly in connection with GR). 

84 ‘One cannot overstate the significance of the influence exerted by Hilbert’s thought and personality on all who 
came out of [the Mathematical Institute at Göttingen]’ (Corry, 2018). However, Eckes (2019) draws attention to the 
considerable influence that also Zermelo and Klein (and perhaps also Minkowski) had on the young Weyl. 

85 The limit point - limit circle theorem from his Habilitation thesis is still used. 

86 ‘Immer wieder drängt es mich dazu, die einzelnen Teile dieses Buches von neuem durchzulesen: denn jede Seite 
zeigt die unerhört sichere Hand des Meisters, der den Gegenstand von den verschiedensten Seiten durchdrungen hat. 
Ich betrachte es als einen Glücklichen Umstand, daß ein so ausgezeichneter Mathematiker sich des neuen Gebiets 
angenommen hat. Er hat es verstanden, mathematische Strenge mit Anschaulichkeit zu verbinden. Der Physiker 
kann aus seinem Buche die Grundlagen der Geometrie und Invarianzentheorie, der Mathematiker diejenigen der 
Elektrizitätslehre und Gravitationstheorie lernen. (...) Hier sieht man ganz besonders mit Staunen, wie in Weyls 
Händen das Komplizierteste einfach und selbstverständlich wird. (...) Hier zeigt sich so recht, daß Weyl nicht 




Weyl’s book is remarkable in many ways, including its attractive mix of mathematics, physics, 
and philosophy,87 but also its lyrical (if not, occasionally, outright hysterical) prose, which is 
very unusual for a mathematical physics text. For example, the fourth edition ends as follows:88 


Whoever looks back over the ground that has been traversed (...) must be overwhelmed by a 
feeling of freedom won – the mind has cast off the fetters which have held it captive. He must 
feel transfused with the conviction that reason is not only a human, a too human, makeshift 
in the struggle for existence, but that, in spite of all the disappointments and errors, it is yet 
able to follow the intelligence which has planned the world, and that the consciousness of 
each one of us is the centre at which the One Light and Life of Truth comprehends itself in 
Phenomena. Our ears have caught a few of the fundamental chords from that harmony of 
the spheres of which Pythagoras and Kepler once dreamed.89 (Weyl, 1921, p. 284) 


Back on earth, Weyl was the first author to describe tensors in a coordinate-free manner as 
multilinear maps, even starting the technical part of his book with an axiomatic treatment of 
vector spaces.90 He then defines tensors as pointwise multilinear maps, just as we do; although in 
the spirit of the time Weyl uses coordinates as soon as he can, the abstract underpinning is clearly 
there. His most significant mathematical innovation was the idea of an affine connection (cf. our 
§3.3, where it is called a linear connection), which (though somewhat paradoxically introduced 
through old-fashioned infinitesimals) gives a covariant derivative (as well as the associated notion 
of parallel transport) independently of the metric.91 Assuming the affine connection to be torsion-free 
(for which he gives some arguments), Weyl also proves that if there is a (nondegenerate) 
metric, what we now call the Levi-Civita or metric connection is the unique affine connection 
for which parallel transport preserves length. His derivation of the Einstein equations from an 
action principle follows Hilbert, including the definition of the energy-momentum tensor as the 
variational derivative of the matter action with respect to the metric. 

However, arguably the most lasting contribution of Raum - Zeit - Materie (from the third 
edition onwards) is Weyl’s idea of (conformal) gauge symmetry, taken up in the next section. 


nur die mathematische Form spielend meistert, sondern auch mit tiefem Blick für das physikalische Wesentliche 
begabt ist. (...) Die Darlegungen der letzten Paragraphen zeigen, wie vereinfachend und klärend der geborene 
Mathematiker da wirken kann. Jedem, der an dem Gebiet mitarbeiten will, wird das Buch unschätzbare Dienste 
leisten, abgesehen von der reinen Freude, die er beim Studium findet.’ (Einstein, 1918b). 

87 Weyl’s wife, Helene (1893-1948), whom he incidentally betrayed with Schrödinger’s wife when they were all 
in Zürich from 1921-1927 (and Weyl helped Schrödinger with the solution of the equation named after him), was a 
student of Edmund Husserl and an intellectual in her own right. Weyl himself also tended towards phenomenology. 

88 There are many editions of the book, of which the first (1918) and the second (1919) are identical. The third 
(1919) and the fourth (1921) editions are major updates, especially the third, in which Weyl introduces his own idea 
of an affine connection without having a metric. The English translation from 1922 is from the fourth edition. 

89 ‘Wer auf den durchmessenen Weg zurückschaut (...) muß von dem Gefühl errungener Freiheit überwältigt 
werden – ein festgefügter Käfig, in den das Denken bisher gebannt war, ist gesprengt –; er muß durchdrungen werden 
von der Gewißheit, daß unsere Vernunft nicht bloß ein menschlicher, allzumenschlicher Notbehelf im Kampf 
des Daseins, sondern ungeachtet aller Trübungen und alles Irrtums doch der Weltvernunft gewachsen ist und das 
Bewußtsein eines jeden von uns der Ort, wo das Eine Licht und Leben der Wahrheit sich selbst in der Erscheinung 
ergreift. Ein paar Grundakkorde jener Harmonie der Sphären sind in unser Ohr gefallen, von der Pythagoras und 
Kepler träumten.’ Translation: Henry L. Brose, pp. 311-312 in Weyl (1922). 

90 The lack of references in §I.2 is an example of the Göttingen habit of “nostrification” (cf. footnote 79), since 
the axioms had already been given by Peano in 1888 (Moore, 1995). It is unclear whether Weyl knew Peano’s work. 

91 Compared with Levi-Civita (1917a), this makes an ambient flat space unnecessary even in the metric case. 
1.9 Mathematical foundations of GR: Towards the modern era 


The full history of post-1915 GR remains to be written, even on the physics side. Einstein himself 
continued to make major contributions to his theory, among which perhaps Einstein (1917b), 
the paper that launched relativistic cosmology (and introduced his cosmological constant), and 
Einstein (1918c), in which he predicted gravitational waves (directly detected almost a century 
later, on 14 September 2015), stand out.92 Moreover, the confirmation of the general relativistic 
prediction of the gravitational bending of sunlight, announced by Eddington in a session of 
the Royal Society on November 6, 1919, sanctified by J.J. Thomson in the Chair as ‘the most 
important result obtained in connection with the theory of gravitation since Newton’s day’, and 
picked up by the world press, made Einstein the celebrity that he has remained until the present. 
Nonetheless, despite the undeniable power and beauty of the theory and the increasing fame 
and prestige of its creator, GR remained at least in physics a niche field until the 1960s. It 
was immediately picked up by the leading astronomer of the day, Eddington, as well as by 
his (now almost equally famous) colleagues De Sitter and Lemaître, and similarly by the greatest 
mathematician of his era, Hilbert, as already mentioned, followed by Levi-Civita, Weyl, and (Elie) 
Cartan in his footsteps. Even major philosophers like Cassirer, Reichenbach, and Schlick wrote 
about the implications of GR.93 However, with a few exceptions the response from the physics 
community (that Einstein himself, never a real astronomer, mathematician, or philosopher, came 
from!) was lukewarm at best.94 This attitude may have been partly due to the hostility to German 
science during and after World War I (although Einstein, while residing in Berlin, had renounced 
his German citizenship as early as 1896 and was a Swiss citizen at the time). Not coincidentally, 
Eddington and, from the other side, Hilbert were among the very few academics who were 
interested in overcoming this hostility. But it lasted for decades. On a different note, Ehlers 
(2007) writes: ‘At that time [the late 1940s] general relativity was considered a difficult and 
useless subject, admitting no interaction between theory and experiment.’ Or (Bryce) DeWitt: 


Most of you can have no idea how hostile the physics community was, in those days, to 
persons who studied general relativity. It was worse than the hostility emanating from some 
quarters today towards the string-theory community. In the mid fifties, Sam Goudsmit, then 
Editor-in-Chief of the Physical Review and Physical Review Letters, would no longer accept 
“papers on gravitation or other fundamental theory.” (DeWitt-Morette, 2011, p. 6) 


The first international conference on general relativity was only held in 1955, and its subse- 
quent revival was due to a small group of dedicated people, partly inspired by applications to 
astrophysics and cosmology, and partly for the theory’s own sake.95 This led to important GR 
communities in the United States (Bergmann, DeWitt, Schild, especially Wheeler at Princeton), 
the Soviet Union (Fock, Ivanenko, especially Zeldovich in Moscow), and Europe, e.g. in France 
(Lichnerowicz, Choquet-Bruhat), Germany (Jordan), Poland (Infeld, Trautman), Ireland (Lanczos, 
Synge, Schrödinger), and the United Kingdom (Bondi, Dirac, Hoyle, McCrea, Whitrow, 
Penrose).96 In particular, Dirac’s student Sciama created the GR school at Cambridge that still 
exists today and once included Hawking, Carter, Ellis, Rees, and many other leading relativists. 


92 See Janssen & Lehner (2014), and of course The Collected Papers of Albert Einstein, from Volume 6 onwards. 

93 See Ryckman (2005) for the reception of general relativity among philosophers. 

94 An exception is Pauli (1921), which he wrote at the age of 21 at the behest of his mentor Sommerfeld. This was 
the first complete review of GR after Weyl. Another exception was Einstein’s friend Lorentz, see footnote 81. 

95 See Thorne (1994), Kaiser (1998), Eisenstaedt (2006), Melia (2009), Ashtekar (2014), Blum, Lalli, & Renn 
(2015, 2016, 2020), Goenner (2017), and Lalli (2017) for personal and scholarly historical studies of this. 

96 See Robinson (2019) about King’s College London, and Lalli (2017) for smaller GR groups since the 1950s. 
In view of the overall structure of this book, in the remainder of this section we restrict 
ourselves to a few brief comments about the transition from what was known to, say, Hilbert 
and Weyl around 1917, to the current formalism of mathematical GR. It would be fair to say 
that Hilbert mainly looked at GR from the point of view of PDEs, whereas Weyl had a more 
geometric view, which he combined with an emphasis on causal structure, as explained below. 
These different perspectives initially developed separately, in that the causal theory did not rely 
on the PDE theory whilst the initial PDE results were local in nature. But the two areas meet 
through the absolutely central notion of global hyperbolicity that is common to both, and in 
modern mathematical GR they are inseparable (although one still has specialists on either side). 
We first discuss the PDE approach (which may indeed be a few months older than the causal one). 

Hilbert (1917) predated Hadamard (1923), in which the Cauchy problem for PDEs was 
first stated. The solution to a given PDE should: (i) exist on a given domain for all suitable 
initial and/or boundary data;97 (ii) be uniquely determined by these data; and (iii) be stable 
against variations in these data (typically as expressed by continuity with respect to certain 
norm-topologies). This was seen as the form of determinism (or “causality”) appropriate for 
physics. In 1917 the second volume of Courant & Hilbert (1937), which gave a complete 
treatment of PDEs as the field was known at the time, was also twenty years in the future.98 
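
Condition (iii) is not automatic. A standard illustration, usually attributed to Hadamard and added here only as a sketch (it is not part of the historical account above), is the Cauchy problem for the Laplace equation in two variables, where arbitrarily small initial data produce arbitrarily large solutions. The symbols $u$, $u_n$, and $n$ are notation chosen for this example.

```latex
\begin{align*}
% Laplace equation with Cauchy data prescribed on the line y = 0
& \partial_x^2 u + \partial_y^2 u = 0, \qquad u(x,0) = 0, \qquad \partial_y u(x,0) = \frac{\sin(nx)}{n}; \\
% Explicit solution for each n = 1, 2, 3, ...
& u_n(x,y) = \frac{\sin(nx)\, \sinh(ny)}{n^2}; \\
% The data tend to zero uniformly, yet the solution blows up at any fixed y > 0
& \sup_x |\partial_y u_n(x,0)| = \frac{1}{n} \to 0, \qquad
  \sup_x |u_n(x,y)| = \frac{\sinh(ny)}{n^2} \to \infty \quad (n \to \infty).
\end{align*}
```

So existence and uniqueness hold for these data, but continuous dependence fails. For the wave equation the analogous initial value problem is stable, which is one reason why the hyperbolic character of the field equations matters in what follows.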

However, in 1917 Hilbert certainly possessed massive knowledge of nineteenth century 
PDE theory, as well as of the early twentieth century interaction between PDEs and functional 
analysis, of which field he had been one of the founders. In particular, Hilbert recognized that 
Einstein’s equations were not of any standard type (i.e. hyperbolic, elliptic, or parabolic) and 
that because of what we now call “Bianchi identities” their initial value problem was ill-posed in 
the sense that reasonable initial data do not determine a unique solution; cf. §1.5. He foresaw 
what we now call geometric uniqueness, see Theorem 7.8 in §7.6, in stating that ‘physically 
meaningful’ quantities were uniquely determined, and that using suitable coordinates (namely 
geodesic normal coordinates, which he called ‘Gaussian’) also led to uniqueness in general.99 

The next important contribution to the PDE side of GR was made by Emmy Noether (1882- 
1935), whose famous article ‘Invariante Variationsprobleme’ explained the difficulties with 
Einstein’s equations that Hilbert had found in terms of the infinite-dimensional symmetries of the 
action or Lagrangian from which these equations could be derived, and introduced what are now 
called the first and second Noether theorems.100 However, her paper is so general that, despite a 
final section commenting on Hilbert’s work, it does not contain any detailed expressions for GR. 

In that sense, it was Georges Darmois (1888-1960) who, citing neither Hilbert nor Noether, 
(co-)founded the theory of the constraints of GR. Darmois (1927) recognized the equations 


$$G_{\mu 0} = 0 \tag{1.23}$$


97 For hyperbolic PDEs such as the wave equation one has initial data; for elliptic PDEs like Laplace’s equation one 
has boundary data; and for parabolic PDEs such as the heat equation one has combinations thereof. 

98 It is impossible to resist quoting a piece from Weyl’s review of this book, which though entirely written by 
Courant clearly carried Hilbert’s spirit: ‘Nowadays many mathematical books do not seem to be written by living 
men who not only know, but doubt and ask and guess, who see details in their true perspective—light surrounded by 
darkness—who, endowed with a limited memory, in the twilight of questioning, discovery, and resignation, weave a 
connected pattern, imperfect but growing, and colored by infinite gradations of significance. The books of the type I 
refer to are rather like slot machines which fire at you for the price you pay a medley of axioms, definitions, lemmas, 
and theorems, and then remain numb and dead however you shake them.’ (Weyl, 1938, p. 602). 

99 See Stachel (1992) for a more detailed analysis of Hilbert’s contribution, as well as for the history of the Cauchy 
problem of GR up to the work of Choquet-Bruhat. For general PDE history, see Brezis & Browder (1998). 

100 The original source is Noether (1918). A sample of the extensive secondary literature is Kossmann-Schwarzbach 
(2011, 2020), Eggertsson (2019), and Read, Teh, & Roberts (2021). Rowe (2021) is a biography of Noether. 
as conditions restricting the initial data. He saw that they automatically “propagate” (in holding 
everywhere provided they are satisfied at t = 0 and the other equations hold, see §7.5), and also 
gave their geometric expressions (7.148) - (7.149) in terms of the first and second fundamental 
forms of the embedded Cauchy surface where they are imposed. He also showed that in the 
wave gauge (or harmonic gauge, first used by De Donder) the remaining six Einstein equations 
were hyperbolic propagation equations for each of the components $g_{\mu\nu}$ of the metric. Finally, he 
studied the possibility of giving initial data on null surfaces (i.e. lightcones), in which he was far 
ahead of his time. This is very impressive for someone who was actually a statistician! 
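
For orientation, the vacuum constraints and the wave gauge can be written in a standard modern form as follows. This is a sketch under assumptions: the symbols $h_{ij}$ (first fundamental form, i.e. the induced metric), $K_{ij}$ (second fundamental form), $D$ (the Levi-Civita connection of $h$), $R_h$ (its scalar curvature), and $Q_{\mu\nu}$ (terms containing at most first derivatives of the metric) are notation chosen here, and signs and normalizations may differ from those in (7.148) - (7.149).

```latex
\begin{align*}
% Scalar (Hamiltonian) constraint, vacuum case
& R_h - K_{ij} K^{ij} + (\mathrm{tr}_h K)^2 = 0; \\
% Vector (momentum) constraint, vacuum case
& D^j K_{ij} - D_i\, (\mathrm{tr}_h K) = 0; \\
% Wave (harmonic) gauge: each coordinate function satisfies the wave equation
& \Box_g x^\mu = 0 \quad \Longleftrightarrow \quad g^{\alpha\beta}\, \Gamma^\mu_{\alpha\beta} = 0; \\
% In this gauge the Ricci tensor acts as a wave operator on each metric component
& R_{\mu\nu} = -\tfrac12\, g^{\alpha\beta}\, \partial_\alpha \partial_\beta\, g_{\mu\nu} + Q_{\mu\nu}(g, \partial g).
\end{align*}
```

The first two lines involve only data on the Cauchy surface, whereas the last line shows why, in the wave gauge, the remaining equations take the form of hyperbolic propagation equations for the individual components of the metric.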

Darmois was also the thesis advisor of André Lichnerowicz (1915-1998), who worked in GR 
from 1937 until 1967 and made many contributions to the field.101 His most important work, 
collected in Lichnerowicz (1955), includes his theorem on asymptotically flat space-times (see 
§8.4), as well as his conformal analysis of the constraint equations (see §8.6). His importance 
as an organizer of the French GR community, e.g. through organizing the Journées Relativistes 
conference series, can hardly be overestimated.102 In that capacity Lichnerowicz was also the 
PhD advisor of Yvonne Choquet-Bruhat (born in 1923), who, during a career that spanned sixty 
years, from a four-page announcement (Fourès-Bruhat, 1948) to a comprehensive 800-page 
textbook General Relativity and the Einstein Equations (Choquet-Bruhat, 2009), led the PDE 
approach to GR by giving direction and proving two of the most important results herself, namely 
the first local existence and uniqueness result (Fourès-Bruhat, 1952) and the crowning maximal 
existence and uniqueness theorem, which she proved in 1969 with Geroch.103 

Subsequent work on the PDE aspects of GR falls into two directions, which might be called 
hyperbolic and elliptic, depending on whether one works mainly on the evolution equations or 
on the constraint equations, respectively, or, phrased differently, on the evolution of the initial 
data or on the initial data themselves. Of course, these aspects cannot be entirely separated. 

On the hyperbolic side, one studies global properties of the above maximal (globally hy- 
perbolic) solutions, notably their extendibility (which does not contradict the formal property 
of maximality) and stability. Even the simplest case, namely the question of the stability of 
Minkowski space-time under small perturbations of its initial data, took a 500+ page book (by 
Christodoulou and Klainerman) to settle in the affirmative. Despite later simplifications of this 
proof, analogous current work on the stability of black hole solutions is published in papers 
whose page count even runs over 800. Apart from stability problems, other goals of this approach 
include (dis)proving Penrose’s cosmic censorship and final state conjectures (see chapter 10). 

On the elliptic side, one highlight has been the proof of the positive mass theorem by Schoen 
& Yau (1979) and Witten (1981), to which a brief introduction will be given in §8.4.104 Many of 
the techniques used in proving uniqueness or “no hair” theorems for black holes (see §§10.9 - 
10.10) also come from the elliptic approach. Another achievement has been the development of 
gluing techniques for solutions to the Einstein equations by gluing their initial data.105 


101 See Lichnerowicz (1992) for a brief memoir, in which he pays special tribute to Elie Cartan (1869-1951), one 
of the founders of modern Lie theory and differential geometry, who also did important work motivated by GR, 
including the geometric reformulation of Newtonian gravity now called Newton-Cartan theory (Malament, 2012). 

102 In this respect also the Les Houches schools founded in 1951 by Cécile Morette should be mentioned. 

103 The historical survey by Ringström (2015) explains the precise regularity of the solutions in Fourès-Bruhat 
(1952) and also puts her work in a much wider mathematical perspective. Choquet-Bruhat (2014) also looks back 
on her results; see also her autobiography A Lady Mathematician in this Strange Universe (Choquet-Bruhat, 2018). 

104 Roughly speaking, in any asymptotically flat space-time (i.e. one in which the metric approaches the Minkowski 
metric at infinity) one can define a quantity in terms of the metric, which for the Schwarzschild solution is the mass 
of the star (or black hole), but which in general is not obviously positive. The theorem states that it is positive. 

105 See e.g. Chrusciel, Galloway, & Pollack (2010), which is actually a general survey of mathematical GR. 
We now turn to the “causal” approach to GR. Its characteristic emphasis on the conformal 
structure of GR, i.e. the equivalence class of the metric tensor g under a rescaling 


$$g_{\mu\nu}(x) \mapsto e^{\lambda(x)} g_{\mu\nu}(x), \tag{1.24}$$


with $\lambda$ an arbitrary smooth function of space and time, originated with Weyl (1918b). Although he
mentions the analogy with Riemann surfaces,106 which undoubtedly drove him in this direction,
his real argument is that what he calls Reine Infinitesimalgeometrie must go beyond Riemannian
geometry, which (according to Weyl) suffers from the defect that parallel transport of vectors
(through the metric or Levi-Civita connection, a concept Weyl himself had co-invented) preserves
their length. This makes length of vectors an absolute quantity, which a ‘pure infinitesimal
geometry’ or a theory of general relativity cannot tolerate. To remedy this, Weyl introduced the
idea of gauge invariance, stating that the laws of nature should be invariant under the rescaling
(1.24). To this end, he introduced what we now call a gauge field $\varphi = \varphi_\mu dx^\mu$ and a compensating
transformation $\varphi_\mu(x) \mapsto \varphi_\mu(x) - \partial_\mu \lambda(x)$, and identified $\varphi$ with the electromagnetic potential
(i.e. $A$). Dancing to the music of time, he then proposed that the pair $(g,\varphi)$ describes all of
physics. This is not the case,107 but the idea of gauge symmetry has lasted and forms one of the
keys to modern high-energy physics and quantum field theory: serendipitously, although it is
misplaced in the classical gravitational context in which Weyl proposed it, through the Standard
Model it has ironically become a cornerstone of non-gravitational quantum physics!

The conformal structure of a Lorentzian manifold determines the lightcones (and hence also 
their interiors), and as such Weyl was of course not the only author to discuss causal structure. 
For example, Einstein (1918c) himself wondered if gravitational waves propagate with the speed
of light, and showed this in a linear approximation; Weyl mentions this also.108 Furthermore,
independently of Weyl, and in fact inspired by special rather than general relativity, Robb (1914,
1936), Reichenbach (1924), Zeeman (1964), and also others axiomatized causal structure as a
specific partial order. In modern notation, if $M$ is Minkowski space-time then the simplest such
relation is $J^+ \subset M \times M$, where $(x,y) \in J^+$ or $x \leq y$ if $y$ lies within or on the future lightcone
emanating from $x$. For general relativistic space-times this may be generalized by defining
$(x,y) \in J^+$ iff there exists a future-directed causal curve from $x$ to $y$ (see §5.3).
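
For Minkowski space-time the relation $J^+$ just described can be checked directly from coordinates. The following is a minimal sketch rather than anything taken from the text: it assumes signature (-,+,+,+), units with c = 1, and events stored as tuples (t, x1, x2, x3); the function names are hypothetical.

```python
# Minimal sketch of the causal relation J^+ on Minkowski space-time.
# Assumptions (not from the text): signature (-,+,+,+), units with c = 1,
# and events given as tuples (t, x1, x2, x3).

def interval(x, y):
    """Minkowski 'squared length' of the displacement y - x."""
    dt = y[0] - x[0]
    spatial = sum((b - a) ** 2 for a, b in zip(x[1:], y[1:]))
    return -dt ** 2 + spatial

def causally_precedes(x, y):
    """True iff (x, y) lies in J^+, i.e. y is within or on the future lightcone of x."""
    return y[0] >= x[0] and interval(x, y) <= 0

origin = (0, 0, 0, 0)
print(causally_precedes(origin, (2, 1, 0, 0)))  # timelike separation, to the future
print(causally_precedes(origin, (1, 1, 0, 0)))  # lightlike separation
print(causally_precedes(origin, (1, 2, 0, 0)))  # spacelike separation
```

The relation is reflexive and antisymmetric, as a partial order should be; on a curved space-time no such closed formula is available and one has to work with causal curves instead.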

These themes (gravitational radiation, conformal invariance, and causal order, with additional
inspiration from some of the drawings of the Dutch artist M.C. Escher) were combined and came
to a head in the work of Roger Penrose (born in 1931). Between 1963 and 1972, with a last
eruption in 1979 (see chapter 10), Penrose introduced the global causal techniques and ideas
in GR that are now central to the mathematical analysis of the subject.109 Moreover, in 1965
he used these techniques to prove the first singularity theorem of GR, based on his concept of
a trapped surface.110 This inspired the singularity theorem of Stephen Hawking (1942-2018),
whose Adams Prize Essay (Hawking, 1966), along with the book by Hawking & Ellis (1973)
that arose from it, may also be counted among the founding documents of mathematical GR.111


106 See page 397. Riemann surfaces may equivalently be defined as either one-dimensional complex manifolds
or as two-dimensional Riemannian manifolds up to conformal equivalence. Modestly, Weyl does not cite his own
decisive contribution to their theory (Weyl, 1913). This equivalence also influenced Penrose’s work on GR.

107 See Einstein’s negative reaction to Weyl (1918b) in Einstein (2002a), Doc. 8. See also Goenner (2004), §4.1.3. 

108 See e.g. page 251 of the English translation of the fourth edition of Raum - Zeit - Materie (Weyl, 1922). 

109 A key exception is the notion of global hyperbolicity, which has its roots in the work of Leray (1953) and was 
adapted to GR by Choquet-Bruhat (1967) and Geroch (1970). Penrose (1963) also worked on the PDE side. 

110See footnote 270 for references on the history of the singularity theorems. See also chapter 6. 

111 See Ellis (2014) for the historical context of this essay and of Hawking’s early work in general.




1.10 Epilogue: General covariance and general relativity 


In this appendix to our historical introduction we return to the theme of general covariance and 
its possible relationship to some relativity principle that generalizes the one underlying Einstein’s 
special theory of relativity. Starting with Einstein himself, this issue has naturally concerned 
many people, without a clear conclusion. But one may at least try to avoid some pitfalls.112
Although the field equations in Einstein (1915d) were generally covariant at last, it took
Einstein another year to relieve himself of all coordinate conditions. Einstein (1916a) still gives
the vacuum field equations in the form $R_{\mu\nu} = 0$ under the unimodular coordinate condition
(1.21), and also their derivation from an action principle is the same as the one he gave in the 
previous year (Einstein, 1915a). It is only in Einstein (1916b), where he derives the generally 
covariant equations (1.22) from what we now call the Einstein-Hilbert action, that we read: 


On the other hand, in antithesis to my own most recent treatment of the subject, there is to 
be complete liberty in the choice of the system of co-ordinates.113


This, then, is what Einstein meant by “general covariance”. But he also believed that general
covariance implies that GR satisfies a “general principle of relativity”. Returning to the
reconstruction of his reasoning in §1.3, there is little doubt that in conflating symmetry properties of
GR as a whole with symmetry properties of its solutions Einstein actually cornered himself:


• Either he explains why in GR geodesic frames of reference are equivalent to arbitrary
frames. But then the same argument (whatever it is) would apply to Minkowski space-time,
and he loses the perfect match between the special principle of relativity and the special
theory of relativity, on which his arguments for general relativity were predicated.


• Or he accepts that geodesic frames of reference are preferred (i.e. “special”) and hence
blasts the general principle of relativity even in GR. Every way you look at it you lose!


In fact, the difference between the theories of special and general relativity cannot lie in general 
covariance. Consider, in Einstein’s own language, the following two equations for the metric: 


$$R_{\mu\nu} = 0; \tag{1.25}$$
$$R_{\rho\sigma\mu\nu} = 0, \tag{1.26}$$


or, in modern notation, Ric(g) = 0 and Riem(g) = 0, respectively, where the former is the Ricci 
tensor, the latter is the Riemann tensor, and g is a Lorentzian metric to be solved for. Eq. (1.25) 
are the vacuum Einstein equations, and let us call (1.26) the Minkowski equations. They share 
exactly the same covariance properties, but (1.25) gives GR (without matter) whereas by a basic 
result in Riemannian geometry (1.26) gives special relativity.114 More generally, almost any
physical theory of the kind known in classical mechanics and field theory can be geometrized
and, at the expense of adding equations like (1.26), be made generally covariant. Thus Einstein
faces two problems in trying to relate general covariance to general relativity (i.e. of motion):


112The history of the debate on general covariance is reviewed in Norton (1993, 1995). Further literature includes 
Anderson (1967), Friedman (1983), Norton (1989, 1999), Brown (2005), Dieks (2006), Earman (2006ab), Giulini 
(2007), Pooley (2015), Wallace (2017), and Dewar (2020). Though closely related, we do not enter the philosophical 
debate between substantivalism and relationalism, which was revisited in the light of the hole argument and general 
covariance by Earman & Norton (1987) and Butterfield (1987, 1989). See also Pooley (2017, 2020) for reviews. 

113 ‘Anderseits soll im Gegensatz zu meiner eigenen letzten Behandlung des Gegenstandes die Wahl des
Koordinatensystems vollkommen freibleiben.’ (Einstein, 1916b, p. 1111). Translation by W. Perrett and G.B. Jeffery.

114 At least locally, see Theorem 4.1. Eq. (1.26) is equivalent to (1.25) plus the vanishing of the Weyl tensor.
1. GR distinguishes geodesic motion from any other and hence does contain preferred frames.
We may then ask what kind of relativity (if not relativity of motion) GR does generalize. 


2. Almost any physical theory can be made generally covariant. Thus general covariance 
cannot be equated with general relativity and by itself must be physically empty.115 This
raises the question which physical property, if any, general covariance does express. 
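
Returning briefly to the pair of equations $\mathrm{Ric}(g) = 0$ and $\mathrm{Riem}(g) = 0$ compared above, their relation can be made explicit through the standard Ricci decomposition of the Riemann tensor in dimension $n \geq 3$. The display below is a sketch under assumed index conventions (Ricci tensor $R_{\sigma\nu} = g^{\rho\mu}R_{\rho\sigma\mu\nu}$, scalar curvature $R = g^{\sigma\nu}R_{\sigma\nu}$, Weyl tensor $C_{\rho\sigma\mu\nu}$), which need not coincide with the conventions used elsewhere in this book.

```latex
% Ricci decomposition in dimension n >= 3 (index conventions are an assumption, see lead-in).
\begin{align*}
R_{\rho\sigma\mu\nu} &= C_{\rho\sigma\mu\nu}
 + \frac{1}{n-2}\left( g_{\rho\mu}R_{\sigma\nu} - g_{\rho\nu}R_{\sigma\mu}
 + g_{\sigma\nu}R_{\rho\mu} - g_{\sigma\mu}R_{\rho\nu} \right) \\
 &\quad - \frac{R}{(n-1)(n-2)}\left( g_{\rho\mu}g_{\sigma\nu} - g_{\rho\nu}g_{\sigma\mu} \right), \\
R_{\mu\nu} = 0 \;&\Longrightarrow\; R_{\rho\sigma\mu\nu} = C_{\rho\sigma\mu\nu}.
\end{align*}
```

Hence, once the vacuum Einstein equations hold, the only curvature left is the Weyl tensor, and demanding that it vanish as well gives back the flatness condition.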


As to the first point, we are more optimistic than the great relativist Synge, who wrote: 


The name [general theory of relativity] is repellent. Relativity? I have never been able 
to understand what that word means in this connection. I used to think that this was my 
fault, some flaw in my intelligence, but it is now apparent that nobody ever understood it, 
probably not even Einstein himself. So let it go. What is before us is Einstein’s theory of 
gravitation.116 (Synge, 1966, p. 7)


However, in developing his special theory Einstein used the notion of “relativity” in a way that 
does seem to survive into GR in a defensible way. In special relativity, apart from the broad (one
would like to, but cannot, say “general”!) relativity principle stating that the laws of physics are
the same in each inertial frame, in the context of electrodynamics that actually led him to his 
theory, Einstein (1905) made the following point, with which he even starts: 


It is known that Maxwell’s electrodynamics - as usually understood at the present time - when
applied to moving bodies, leads to asymmetries which do not appear to be inherent in the 
phenomena. Take, for example, the reciprocal electrodynamic action of a magnet and a 
conductor. The observable phenomenon here depends only on the relative motion of the 
conductor and the magnet, whereas the customary view draws a sharp distinction between 
the two cases in which either the one or the other of these bodies is in motion. For if the 
magnet is in motion and the conductor at rest, there arises in the neighbourhood of the 
magnet an electric field with a certain definite energy, producing a current at the places 
where parts of the conductor are situated. But if the magnet is stationary and the conductor 
in motion, no electric field arises in the neighbourhood of the magnet. In the conductor, 
however, we find an electromotive force (...) which gives rise (...) to electric currents of 
the same path and intensity as those produced by the electric forces in the former case.117
(Einstein, 1905, p. 891) 


115 This was first pointed out to Einstein in a (now) famous paper by a high-school teacher called Kretschmann 
(1917). Einstein grudgingly conceded this point; see Norton (1993) and Giovanelli (2013, 2019) for a study of their 
debate. Kretschmann raised his concerns in the specific context of Einstein’s “point-coincidence argument”, which
was Einstein’s answer to his earlier “hole argument” discussed in §1.5, which had led him to temporarily abandon
general covariance, almost blocking his way to GR. Thus Kretschmann argued that any physical theory whose 
empirical content lies solely in point-coincidences (as Einstein had it) can be written in generally covariant form. 

116This quotation has been borrowed from Norton (1995). John Lighton Synge (1897-1995) wrote powerfully and 
beautifully. The entire preface of his book (Synge, 1966) would be worth quoting, or at least the brilliant first page. 

117 Daß die Elektrodynamik Maxwells - wie dieselbe gegenwärtig aufgefaßt zu werden pflegt - in ihrer Anwendung
auf bewegte Körper zu Asymmetrien führt, welche den Phänomenen nicht anzuhaften scheinen, ist bekannt. 
Man denke z. B. an die elektrodynamische Wechselwirkung zwischen einem Magneten und einem Leiter. Das 
beobachtbare Phänomen hängt hier nur ab von der Relativbewegung von Leiter und Magnet, während nach der 
üblichen Auffassung die beiden Fälle, daß der eine oder der andere dieser Körper der bewegte sei, streng voneinander 
zu trennen sind. Bewegt sich nämlich der Magnet und ruht der Leiter, so entsteht in der Umgebung des Magneten 
ein elektrisches Feld von gewissem Energiewerte, welches an den Orten, wo sich Teile des Leiters befinden, einen 
Strom erzeugt. Ruht aber der Magnet und bewegt sich der Leiter, so entsteht in der Umgebung des Magneten kein 
elektrisches Feld, dagegen im Leiter eine elektromotorische Kraft (...) die aber (...) zu elektrischen Strömen von
derselben Größe und demselben Verlaufe Veranlassung gibt, wie im ersten Falle die elektrischen Kräfte. Translation
by W. Perrett and G.B. Jeffery (Einstein et al., 1923). In connection with GR see also Janssen (2012, 2014).
In modern language, the separation of the electromagnetic field $F_{\mu\nu}$ into an electric part $F_{0i}$ and
a magnetic part $F_{ij}$ depends on the observer. Similarly, if in GR one identifies the metric $g_{\mu\nu}$
with the frame-independent “inertio-gravitational potential” and the Christoffel symbols $\Gamma^{\rho}_{\mu\nu}$
with the actual gravitational field, then a freely falling person locally puts $\Gamma^{\rho}_{\mu\nu}$ to zero and hence
feels no gravitational field, whereas a stationary observer has non-vanishing Christoffel symbols 
and hence attributes the observed motion to gravity. From this point of view, the non-tensorial 
character of the Christoffel symbols is a coordinate-dependent blessing in disguise! 
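
The observer dependence of the electric/magnetic split can be illustrated numerically. The sketch below is not taken from the text: it assumes signature (-,+,+,+), units with c = 1, a field tensor with lower indices stored as a 4x4 antisymmetric array, and a boost along the first spatial axis; all function names are hypothetical. A field that is purely magnetic for one observer acquires an electric part for a relatively moving one, while the invariant $F_{\mu\nu}F^{\mu\nu}$ is unchanged.

```python
# Sketch: the split of F into electric and magnetic parts depends on the observer.
# Assumptions (not from the text): signature (-,+,+,+), c = 1, index 0 = time,
# F stored with lower indices as a 4x4 antisymmetric list of lists, boost along x.
import math

ETA = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def matmul(a, b):
    """Product of two 4x4 matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def boost_x(v):
    """Lorentz boost matrix along x with velocity v (|v| < 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0, 0], [-g * v, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transform(F, L):
    """Transform a tensor with two lower indices: F' = L^T F L."""
    return matmul(transpose(L), matmul(F, L))

def electric_part(F):
    """The components F_{0i}, i = 1, 2, 3."""
    return [F[0][i] for i in (1, 2, 3)]

def invariant(F):
    """F_{mu nu} F^{mu nu}, with indices raised by ETA."""
    F_up = matmul(ETA, matmul(F, ETA))
    return sum(F[m][n] * F_up[m][n] for m in range(4) for n in range(4))

# Purely magnetic field for the first observer: only F_{12} = -F_{21} is non-zero.
B = 1.0
F_rest = [[0.0] * 4 for _ in range(4)]
F_rest[1][2], F_rest[2][1] = B, -B

F_moving = transform(F_rest, boost_x(0.6))
print("electric part, first observer: ", electric_part(F_rest))
print("electric part, second observer:", electric_part(F_moving))
print("invariants:", invariant(F_rest), invariant(F_moving))
```

Whether the matrix used here represents a boost with velocity v or -v is a matter of convention and does not affect the point being illustrated.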

Another concept that perhaps GR relativizes more generally than special relativity does is 
simultaneity. More than anything else, what makes special relativity a genuine challenge to our 
world view is that the now has become (inter)subjective: for an observer at rest in the usual (t,x) 
coordinates the planes of simultaneity are horizontal, whereas for a (relatively) moving observer 
they are tilted—although they are still planes (this explains phenomena like length contraction 
and time dilation). As we shall see in chapter 8, in the 3 + 1 split of GR there is no preference for 
a foliation of space-time by “horizontal” or “flat” planes; essentially any choice of hypersurfaces 
of simultaneity, typically curved, is allowed. See also §8.11. 

The second question remains. Conceding that any physical theory could indeed be brought 
into generally covariant form using the absolute differential calculus, Einstein’s own answer 
was that only some theories (including, of course, GR) are ‘simple and transparent’ in generally 
covariant form.118 With due respect,119 this is balderdash. First, eq. (1.26) is as simple and
transparent as (1.25), or even simpler, since (1.25) is a contraction of (1.26). Also, the example 
of Newtonian gravity that Einstein gave would soon be reformulated in geometric fashion by 
Cartan, resulting in a theory as simple and transparent as GR. Second, criteria like simplicity, 
transparency, and beauty are subjective, time-dependent, and relative to one’s (mathematical) 
education. In 1900 only a few mathematicians and physicists were familiar with linear algebra, 
but now this is a first-year subject which most students find simpler than, say, analysis.120

Another answer is that GR “is a geometric theory”. But this is not a very good answer either, 
since special relativity is as geometric (and as covariant) as GR, as we have already seen. In fact, 
Einstein himself was not very impressed by this argument at all, pointing out that any theory 
containing vectors (which would mean practically all of physics) could be called “geometric”.121

A third answer would be that general covariance expresses the fact that GR “lacks absolute 
objects”.122 But once again, formulated like (1.26), so does special relativity, at least in writing
down its equations. Perhaps it ends up with an absolute object, namely the Minkowski metric,
but then again, any specific solution to (1.25) is also an absolute object in the same sense.123


118 See Einstein’s (1918a) answer to Kretschmann (1917) as well as his 1954 letter to De Broglie (cf. §1.3).

119 Einstein’s (later) views in this respect, for which Norton (2000) presents a historical analysis,
were similar to Dirac’s, see e.g. the book chapter ‘Mathematical Beauty’ by his biographer H. Kragh,
https://simplydirac.pressbooks.com/chapter/mathematical-beauty/. See also Hossenfelder (2018).

120 This objection also applies to various refinements of Einstein’s point that are discussed by Norton (1993, 1995).
For example, Bergmann (1942) argued that GR stands out because, starting from a generally covariant reformulation, 
the structure of other theories (like special relativity and Newtonian gravity) simplifies if their covariance group 
is reduced. Here again one wonders what “simplicity” means: if it means “less structure”, then special relativity 
would be simpler in its generally covariant form, whose equations assume just a metric, rather than a specific one. 

121 See Lehmkuhl (2014).

122 The idea is that the symmetry group is the largest one preserving all “absolute objects”. This works for special 
relativity if (only) the Minkowski metric is seen as absolute, and it works for GR if nothing is deemed absolute. 

123 The difference between GR and generally covariant special relativity is that the latter is categorical, in (barring
the topology of space-time) having only one solution, up to isomorphism, much as Hilbert’s (1900) axioms for the
real numbers (as a complete totally ordered field) are categorical, at least if they are expressed in second-order logic 
(Shapiro, 1991). Whatever its implications, categoricity does not hold for e.g. Newton—Cartan gravity. 
Moreover, Minkowski space-time is less absolute than some generic solution to the Einstein
equations in the sense that the former has a large isometry group (viz. the Poincaré group, whose
dimension is the maximal one an isometry group can have), whereas the isometry group of
generic Lorentzian metrics is trivial. This also makes GR less “covariant” than special relativity.

Against this, one may argue that in GR the metric is, after all, less absolute than in special
relativity because in the full Einstein equations (1.22) it is coupled to the matter distribution.
The point, then, is that there is no such coupling for (1.26), and this is supposed to make GR
superior to special relativity because GR has no objects “that act but are not acted on”. This
fact is hidden by the vacuum field equation (1.25) and hence the argument rests on a distinction
between the vacuum Einstein equations and those with matter. But this distinction is artificial.


Furthermore, where does the buck stop? At least in its modern formulation GR uses smooth
manifolds, which are modeled on $\mathbb{R}^4$ with the usual smooth structure and topology.124 These
are not dynamically generated but assumed, and hence should be counted as “absolute objects”.
Hence the argument, once it is carried through consistently, ultimately turns against GR, too.
The last argument we discuss is that of the particle physicist: much as the gauge invariance
of electrodynamics expresses the fact that at the quantum level this theory describes interacting
massless particles with helicity ±1, i.e., photons, the general covariance of GR leads to massless
particles with helicity ±2, i.e., gravitons (see §8.5). This is arguably the strongest and physically
most compelling argument for general covariance, but in the absence of even a perturbative
theory of quantum gravity it is still feeble, not to speak of the wide gap between this kind of
reasoning and the geometric structure of GR that leads to its general covariance in the first place.


Our conclusion is that while general covariance does not express a general relativity principle, 
it remains mysterious what it does express. Any resolution will have to navigate between: 


• The pull towards physical relevance of general covariance in being a symmetry of the field
equations of GR, whose ingredients (curvature and energy-momentum) are physical.125


• The pull against physical relevance, since no one has been able to figure it out so far.


At least in physical and mathematical practice, the second force has won: in its modern form 
of diffeomorphism invariance of Einstein’s equations, general covariance is seen as a gauge 
symmetry, in the following sense.126 Let $M_i$ be a 4d manifold and $g_i$ a Lorentzian metric on $M_i$
solving the vacuum Einstein equations ($i = 1,2$). Then two pairs $(M_1,g_1)$ and $(M_2,g_2)$ that differ by an
isometry (i.e. a diffeomorphism that preserves the metric) describe the same space-time.127
This resolves the Hole Argument (at least in Hilbert’s version, see Theorem 7.10) and
also justifies the widely spread identification of active and passive coordinate transformations.
Nonetheless, we will return to this discussion in the context of the “problem of time”, which we
regard as the “Hamiltonian shadow” of the problem of general covariance (see §8.11).


124 Even with the usual topology there are innumerable inequivalent smooth structures on $\mathbb{R}^4$, each giving rise to a
different concept of a smooth manifold. This is a result from Donaldson theory (Donaldson & Kronheimer, 1997). 

125 See e.g. Brading & Castellani (2003), Belot (2013), Caulton (2015), and Dewar (2019) for discussions of the 
concept of symmetry in physics. This concept is very tricky, as the debate on general covariance shows! Here we 
just call general coordinate transformations (or, for that matter, diffeomorphisms) symmetries because they are 
transformations that preserve solutions to the Einstein equations, cf. §1.5. Defining symmetries as transformations 
that preserve solutions is a ‘recipe for disaster’ (Belot, 2013, §3), but we use the idea the other way round. 

126The gauge group depends on the details. In Einstein’s GR it is a diffeomorphism group, but in other versions of 
GR it may consist of local Lorentz or Poincaré transformations (Blagojević & Hehl, 2013; Krasnov, 2020). 

127But what does this mean? Suppose two stationary black holes have the same parameters and hence are both 
described by exactly the same Kerr metric, are they really “the same space-time”? The mind boggles! 
2 General differential geometry
The mathematical language of GR is differential geometry, enriched by geometric analysis. 


2.1 Manifolds 


We start by reviewing the key definitions underlying the concept of a manifold.128 First, a space
means a topological space, assumed Hausdorff. The topology of $M$ (i.e. the set of its open sets)
is denoted by $\mathcal{O}(M)$, so that $U \in \mathcal{O}(M)$ means that $U \subset M$ and $U$ is open. Since otherwise it
cannot support a Lorentzian metric, in GR we may assume that $M$ is also metrizable.129 If $M$ is a
(topological) manifold, this is equivalent to $M$ being second countable as well as paracompact.


Definition 2.1 1. An $n$-dimensional (topological) manifold is a space $M$ such that any $x \in M$
has a nbhd (= neighbourhood) $U \in \mathcal{O}(M)$ that is homeomorphic to some $V \in \mathcal{O}(\mathbb{R}^n)$.
Equivalently, one may require $V$ to be $\mathbb{R}^n$ itself, or some open ball in $\mathbb{R}^n$.


2. A chart on $M$ is a pair $(U,\varphi)$ where $U \in \mathcal{O}(M)$ and $\varphi: U \to \mathbb{R}^n$ is a homeomorphism
onto its image $V = \varphi(U)$. A chart $(U,\varphi)$ gives a coordinate system on $U$, in that the
coordinates $(x^1,\ldots,x^n)$ of $x \in U$ are $x^i = \varphi^i(x)$, where one writes $\varphi: U \to \mathbb{R}^n$ as
$(\varphi^1,\ldots,\varphi^n)$, where $\varphi^i: U \to \mathbb{R}$ in terms of the standard basis of $\mathbb{R}^n$ ($i = 1,\ldots,n$).


3. A $C^k$-atlas on $M$ (where $k \in \mathbb{N} \cup \{\infty\}$) is a collection of charts $(U_\alpha,\varphi_\alpha)$, where $M = \bigcup_\alpha U_\alpha$
(i.e. the $U_\alpha$ form an open cover of $M$), and, whenever $U_{\alpha\beta} = U_\alpha \cap U_\beta$ is not empty, writing
$V_{\alpha\beta} = \varphi_\alpha(U_{\alpha\beta})$, the map $\varphi_\beta \circ \varphi_\alpha^{-1}: V_{\alpha\beta} \to \mathbb{R}^n$ is $C^k$ (since $V_{\alpha\beta} \subset \mathbb{R}^n$ this is well defined).


4. Two $C^k$-atlases $(U_\alpha,\varphi_\alpha)$ and $(U'_{\alpha'},\varphi'_{\alpha'})$ on a topological manifold $M$ are equivalent
if their union is a $C^k$-atlas, i.e., if all transition functions $\varphi'_{\beta'} \circ \varphi_\alpha^{-1}$ and $\varphi_\beta \circ (\varphi'_{\alpha'})^{-1}$
(if defined) are $C^k$; this is indeed an equivalence relation. A $C^k$-structure on $M$ is an
equivalence class of $C^k$ atlases on $M$. A $C^k$-manifold is a manifold with a $C^k$ structure.
A smooth manifold is a manifold with a $C^\infty$ structure, that is, a $C^\infty$-manifold.


5. A function $f: M \to \mathbb{R}$ on a smooth manifold is smooth, written $f \in C^\infty(M)$, if for some
fixed atlas (within its equivalence class), each map $f \circ \varphi_\alpha^{-1}: V_\alpha \to \mathbb{R}$ is smooth.130


6. For two smooth manifolds $M, N$, a map $\psi: M \to N$ is smooth if for each $f \in C^\infty(N)$
the pullback $\psi^* f = f \circ \psi$ is in $C^\infty(M)$. Equivalently, in terms of the manifolds: for any
chart $(U,\varphi)$ on $M$ and chart $(\tilde{U},\tilde{\varphi})$ on $N$ such that $\tilde{U}' = \psi(U) \cap \tilde{U} \neq \emptyset$, the function
$\tilde{\varphi} \circ \psi \circ \varphi^{-1}: V' \to \tilde{V}$ is smooth (in the calculus sense), where $V' = \varphi(U \cap \psi^{-1}(\tilde{U}')) \subset V$.


7. A diffeomorphism of $M$ is an invertible smooth map $\psi: M \to M$ with smooth inverse.
Under the obvious operations, such maps form the diffeomorphism group $\mathrm{Diff}(M)$ of $M$.


Unless the contrary is stated, we henceforth assume that $M$ is a smooth manifold equipped with
some $C^\infty$ atlas $(U_\alpha,\varphi_\alpha)$, and that all maps between smooth mathematical objects are smooth.
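
To see the notions of chart, atlas, and transition function at work, consider the circle $S^1 \subset \mathbb{R}^2$ with its two stereographic charts. This example is not taken from the text; the sketch below uses hypothetical function names and simply checks numerically that the transition function between the two charts is $t \mapsto 1/t$, which is smooth on the overlap $t \neq 0$.

```python
# Sketch: two stereographic charts on the circle S^1 and their transition function.
# The names phi_north / phi_south are illustrative choices, not notation from the text.
import math

def phi_north(p):
    """Chart on S^1 minus the north pole (0, 1): projection from (0, 1) onto the x-axis."""
    x, y = p
    return x / (1.0 - y)

def phi_south(p):
    """Chart on S^1 minus the south pole (0, -1): projection from (0, -1) onto the x-axis."""
    x, y = p
    return x / (1.0 + y)

def phi_north_inverse(t):
    """Inverse of phi_north: maps t in R back to a point of the circle."""
    d = t * t + 1.0
    return (2.0 * t / d, (t * t - 1.0) / d)

def transition(t):
    """phi_south o phi_north^{-1}, defined for t != 0; analytically equal to 1/t."""
    return phi_south(phi_north_inverse(t))

for t in (0.5, -2.0, 3.0):
    p = phi_north_inverse(t)
    print(t, p, transition(t), 1.0 / t)
```

Since the two chart domains cover the circle and the transition function and its inverse are smooth where defined, the two charts form a $C^\infty$-atlas in the sense of Definition 2.1.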


128 See §2.6 for manifolds with boundary. References for this chapter are Choquet-Bruhat & DeWitt-Morette
(1982), Abraham & Marsden (1985), Kriele (1999), Frankel (2004), Lee (2012), and Mărcuț (2016).

129 See e.g. Palomo & Romero (2006), §1.1, or Minguzzi (2019), §1.8.

130 This is then true for any atlas. Conversely, $M$ as a manifold can be reconstructed from $C^\infty(M)$ as a commutative
algebra via homomorphisms $\mathrm{ev}: C^\infty(M) \to \mathbb{R}$. See e.g. Navarro González & Sancho de Salas (2003), chapter 2.
2.2 Tangent bundle


The differential geometry relevant to GR comes from the tangent bundle, which generates the 
entire tensor calculus. Of the many roads to this bundle we prefer an (initially) algebraic 
construction in terms of derivations on $C^\infty(M)$, from which the geometric picture emerges.131
Readers who are mainly interested in using tangent bundles can move straight to Definition 2.4. 


Definition 2.2 1. A derivation of an algebra $A$ (over $\mathbb{R}$) is a linear map $\delta: A \to A$ satisfying
$$\delta(ab) = \delta(a)b + a\delta(b). \tag{2.1}$$
2. For any smooth manifold $M$, a point derivation at $x \in M$ is a linear map
$$\delta_x: C^\infty(M) \to \mathbb{R} \tag{2.2}$$
that satisfies the Leibniz rule
$$\delta_x(fg) = \delta_x(f)g(x) + f(x)\delta_x(g). \tag{2.3}$$


In no. 2, $A = C^\infty(M)$ is seen as a (commutative) algebra with respect to pointwise operations.

The set $\mathrm{Der}(A)$ of all derivations of $A$ is a vector space (again over $\mathbb{R}$). If $A$ is associative and
commutative, as is the case for $A = C^\infty(M)$, then $\mathrm{Der}(A)$ is also an $A$-module under the natural
action $(a\delta)(b) = a\delta(b)$. In addition, $\mathrm{Der}(A)$ is a Lie algebra under the bracket
$$[\delta_1,\delta_2] = \delta_1 \circ \delta_2 - \delta_2 \circ \delta_1. \tag{2.4}$$
For $M = \mathbb{R}^n$, taking $X^i = \delta(x^i)$ it follows that each derivation $\delta$ of $C^\infty(\mathbb{R}^n)$ assumes the form
$$\delta f(x) = \sum_{i=1}^n X^i(x) \frac{\partial f}{\partial x^i}(x), \tag{2.5}$$
where $X \in C^\infty(\mathbb{R}^n,\mathbb{R}^n)$, henceforth called $\mathfrak{X}(\mathbb{R}^n)$, is an “old-fashioned vector field” on $\mathbb{R}^n$, i.e.
a field of arrows. Conversely, $X$ defines a derivation $\delta = \delta_X$ by reading (2.5) as a definition of
$\delta$. This gives a bijection $X \leftrightarrow \delta_X$ between the set $\mathfrak{X}(\mathbb{R}^n)$ of all vector fields on $\mathbb{R}^n$ and the set
$\mathrm{Der}(C^\infty(\mathbb{R}^n))$ of all derivations on $C^\infty(\mathbb{R}^n)$. We further pass to point derivations by defining
$$\delta_x(f) := \delta f(x), \tag{2.6}$$
where $\delta \in \mathrm{Der}(C^\infty(\mathbb{R}^n))$. Conversely, Definition 2.2 implies that a family of point derivations
$x \mapsto \delta_x$, defined for all $x \in \mathbb{R}^n$, comes from a single derivation $\delta$ via (2.6), and hence from a
vector field $X$ via $\delta = \delta_X$, iff the map $x \mapsto \delta_x(f)$ is smooth from $\mathbb{R}^n$ to $\mathbb{R}$ for each $f \in C^\infty(\mathbb{R}^n)$.
Eq. (2.4) also has a match for vector fields: $\mathfrak{X}(\mathbb{R}^n)$ is a Lie algebra under the commutator
$$[X,Y](f) = X(Y(f)) - Y(X(f)). \tag{2.7}$$


131 An algebra $A$ (here always defined over $\mathbb{R}$) is a real vector space equipped with a bilinear map $A \times A \to A$,
usually written $(a,b) \mapsto ab$. Many algebras are associative in that $(ab)c = a(bc)$ for all $a,b,c \in A$, as well as
commutative, i.e. $ab = ba$ for all $a,b \in A$. Lie algebras are neither: here one writes $(a,b) \mapsto [a,b]$, with axioms
$[a,b] = -[b,a]$ as well as the Jacobi identity $[a,[b,c]] + [c,[a,b]] + [b,[c,a]] = 0$. A module over an algebra $A$ is a
vector space $V$ with a bilinear map $A \times V \to V$, written $(a,v) \mapsto av$, such that $a(bv) = (ab)v$ (or, for a Lie algebra, $a(bv) - b(av) = [a,b]v$).
In coordinates, where we use components $X = \sum_i X^i \partial_i$ and $Y = \sum_j Y^j \partial_j$, we have
$$[X,Y] = \sum_i [X,Y]^i \partial_i; \qquad [X,Y]^i = \sum_j \left( X^j \partial_j Y^i - Y^j \partial_j X^i \right). \tag{2.8}$$
Relative to (2.7) and (2.4), the bijection $X \leftrightarrow \delta_X$ is promoted to an isomorphism of Lie algebras.
Finally, if $\mathfrak{X}(\mathbb{R}^n)$ carries the $C^\infty(\mathbb{R}^n)$ action given by $(fX)^i(x) = f(x)X^i(x)$, then $X \leftrightarrow \delta_X$ is
in addition an isomorphism of $C^\infty(\mathbb{R}^n)$ modules. Since $X: \mathbb{R}^n \to \mathbb{R}^n$ is given by its components
$X^i: \mathbb{R}^n \to \mathbb{R}$, as a $C^\infty(\mathbb{R}^n)$ module $\mathfrak{X}(\mathbb{R}^n)$ decomposes as a direct sum of $n$ copies of $A = C^\infty(\mathbb{R}^n)$.
By definition, this makes $\mathfrak{X}(\mathbb{R}^n)$ a free module over $C^\infty(\mathbb{R}^n)$. Of course, the same is then true
for $\mathrm{Der}(C^\infty(\mathbb{R}^n))$. In sum, looking at a vector field $X$ as the corresponding derivation $\delta_X$, we
often identify $\mathrm{Der}(C^\infty(\mathbb{R}^n))$ with $\mathfrak{X}(\mathbb{R}^n)$, and this identification preserves all relevant structure.
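
As a small consistency check between the commutator (2.7) and its coordinate form, consider the following worked example on $\mathbb{R}^2$. The particular vector fields are chosen here purely for illustration and do not appear in the text; repeated indices $j$ are summed over.

```latex
% Illustrative choice (an assumption, not from the text): X = \partial_1, Y = x^1 \partial_2 on R^2.
\begin{align*}
X &= \partial_1, \qquad Y = x^1\,\partial_2, \\
[X,Y]^1 &= X^j\partial_j Y^1 - Y^j\partial_j X^1 = 0, \\
[X,Y]^2 &= X^j\partial_j Y^2 - Y^j\partial_j X^2 = \partial_1(x^1) - 0 = 1, \\
\text{so}\quad [X,Y] &= \partial_2. \\
\text{Check via (2.7):}\quad X(Y(f)) - Y(X(f)) &= \partial_1\!\left(x^1\,\partial_2 f\right) - x^1\,\partial_2\partial_1 f = \partial_2 f.
\end{align*}
```

The second-order terms cancel because partial derivatives commute, which is why the commutator of two derivations is again a first-order operator.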
We now generalize this story to arbitrary manifolds $M$. On the algebraic side, we have the
derivations $\mathrm{Der}(C^\infty(M))$. We are going to define vector fields geometrically as sections of the
tangent bundle $TM$ to $M$, whose construction is best understood in a more general form.


Definition 2.3 A (real, locally trivial) $k$-dimensional vector bundle over $M$ is an open surjective
map $\pi: E \to M$, where $E$ is a manifold, such that:


1. For each $x \in M$, the fiber $E_x := \pi^{-1}(x)$ is a $k$-dimensional (real) vector space, i.e. $E_x \cong \mathbb{R}^k$
(where $k$ is independent of $x$). This is the main point. More technically:


2. $M$ has an open cover $(U_i)$ with diffeomorphisms $\Phi_i: \pi^{-1}(U_i) \to U_i \times \mathbb{R}^k$ such that:


(a) Each restriction $\Phi_i: E_x \to \{x\} \times \mathbb{R}^k$ is an isomorphism of vector spaces ($x \in U_i$);


(b) If $U_{ij} = U_i \cap U_j \neq \emptyset$, then $\Phi_{ij} = \Phi_i \circ \Phi_j^{-1}: U_{ij} \times \mathbb{R}^k \to U_{ij} \times \mathbb{R}^k$ is the identity on
the first coordinate and a vector space isomorphism on the second one.


A vector bundle map from $\pi_1: E \to M$ to $\pi_2: F \to N$ is a pair of maps $\varphi_f: E \to F$ and
$\varphi_b: M \to N$ such that $\pi_2 \circ \varphi_f = \varphi_b \circ \pi_1$, and each “fiber” map $\varphi_f: E_x \to F_{\varphi_b(x)}$ is linear.


The simplest $k$-dimensional vector bundle over $M$ is $E = M \times \mathbb{R}^k$ with $\pi$ given by projection
on the first coordinate; this is called a trivial bundle. A (cross-)section of $E$ is a map $s: M \to E$
such that $\pi \circ s = \mathrm{id}_M$, i.e., $\pi(s(x)) = x$ for each $x \in M$. The set of smooth sections of $E$ is
denoted by $\Gamma(E)$ or $\Gamma(M,E)$. This is a vector space. Under the natural action
$$C^\infty(M) \times \Gamma(E) \to \Gamma(E); \qquad (fs)(x) = f(x)s(x), \tag{2.9}$$
the vector space $\Gamma(E)$ is a finitely generated projective (f.g.p.) module over $C^\infty(M)$.132
Sections $s$ of the trivial bundle $E = M \times \mathbb{R}^k \to M$ bijectively correspond to maps $\tilde{s}: M \to \mathbb{R}^k$
via $s(x) = (x,\tilde{s}(x))$. Hence we obtain, as an isomorphism of f.g.p. $C^\infty(M)$-modules,
$$\Gamma(M \times \mathbb{R}^k) \cong C^\infty(M,\mathbb{R}^k). \tag{2.10}$$
The Serre-Swan Theorem provides an isomorphism between f.g.p. modules $\mathcal{E}$ over $C^\infty(M)$ and
vector bundles $E \to M$ over $M$, in such a way that $\mathcal{E} \cong \Gamma(E)$. We first define $E$ as a set by
$$E := \bigsqcup_{x \in M} E_x; \qquad E_x = \mathcal{E}/\!\sim_x\; = \mathcal{E}/(C^\infty(M;x) \cdot \mathcal{E}). \tag{2.11}$$


132 An $A$-module $\mathcal{E}$ is called finitely generated projective if there exists an $A$-module $\mathcal{F}$ such that $\mathcal{E} \oplus \mathcal{F}$ is free,
i.e. isomorphic to a finite direct sum $\oplus^k A$. Equivalently, $\mathcal{E} \cong p(\oplus^k A)$ for some idempotent $p \in M_k(A)$ (i.e. $p^2 = p$).
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Le. 51 ~x 52 iff s1 — s2 € CP (M) - &, defined as the linear span in & of all fs, where s € & and 
fEC(M) = {f ©C°(M) | f(x) =O}. (2.12) 
Then each fiber E, of E is a vector space under the linear structure inherited from £, that is, 
A [sila + uls2])x = [As + usa]; 0:= [0],, (2.13) 
where [s], is the equivalence class of s with respect to ~y, and A,u € IR. Subsequently, define 
& T(E); Ss; s(x) = [Sle (2.14) 


so that $\hat{s}(x) \in E_x$ and hence $\hat{s} : M \to E$ is a cross-section of $E$. Then there is a unique smooth structure on $E$ such that (2.14) is an isomorphism of $C^\infty(M)$ modules. This isomorphism maps $C^\infty(M;x)\cdot\mathcal{E}$ to $\Gamma(E;x) := \{ s \in \Gamma(E) \mid s(x) = 0 \}$, so that the mirror of (2.11) under the isomorphism (2.14) is

$$ \Gamma(E)/\Gamma(E;x) \cong E_x. \tag{2.15} $$

We apply this to the $C^\infty(M)$-module $\mathcal{E} = \mathrm{Der}(C^\infty(M))$, and notice that we have an isomorphism

$$ \mathrm{Der}(C^\infty(M))/\!\sim_x \; \to \mathrm{Der}_x(C^\infty(M)); \qquad [\delta]_x \mapsto \delta_x, \tag{2.16} $$

where $\mathrm{Der}_x(C^\infty(M))$ is the vector space of all point derivations $\delta_x$ of $M$, cf. (2.2) - (2.3). Although $\mathrm{Der}(C^\infty(M))$ may no longer be free (as it is for $M = \mathbb{R}^n$), using charts one can show that it is finitely generated projective, so that the above procedure for defining a vector bundle $E$ is applicable.


Definition 2.4 The tangent bundle $\pi : TM \to M$ is the vector bundle $E$ constructed from

$$ \mathcal{E} = \mathrm{Der}(C^\infty(M)) \tag{2.17} $$

as in the above procedure, replacing (2.11) by (2.16). That is, the total space and fibers are

$$ TM := \bigsqcup_{x \in M} T_xM; \qquad T_xM := \mathrm{Der}_x(C^\infty(M)), \tag{2.18} $$

and the smooth structure of $TM$ is (uniquely) defined by the property that the map

$$ \mathrm{Der}(C^\infty(M)) \to \mathfrak{X}(M) := \Gamma(TM); \qquad \delta \mapsto (x \mapsto \delta_x), \tag{2.19} $$

where $\delta_x$ is defined by (2.6), is an isomorphism. A vector field on $M$ is a cross-section of $TM$.


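In a local chart $\varphi : U \to \mathbb{R}^n$, for $x \in U$ we define the symbol $\partial_i$ as an element of $T_xM$ by

$$ \partial_i f(x) := \frac{\partial (f \circ \varphi^{-1})}{\partial x^i}(\varphi(x)), \tag{2.20} $$

where $f \in C^\infty(U)$ and $\varphi^{-1}$ is the inverse of $\varphi : U \to V = \varphi(U)$. With $V \subset \mathbb{R}^n$, the function $f \circ \varphi^{-1} : V \to \mathbb{R}$ is the coordinate expression $f(x^1,\dots,x^n)$ of $f$, so that $\partial_i$ in (2.20) may be taken literally. This also shows that $(\partial_1,\dots,\partial_n)$ is a basis of $T_xM$, so that we may expand $X_x \in T_xM$ as

$$ X_x = \sum_{i=1}^n X^i \partial_i; \qquad X^i = X_x(\varphi^i), \tag{2.21} $$

where $\varphi = (\varphi^1,\dots,\varphi^n) : U \to \mathbb{R}^n$. Thus $TM$ is an $n$-dimensional vector bundle over $M$.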
In conclusion, a vector field on $M$, written $X \in \mathfrak{X}(M)$, is a map $x \mapsto X_x$, also written as $x \mapsto X(x)$, where $x \in M$ and $X_x \in T_xM$ as defined by (2.18). A derivation on $M$ is a map $\delta : C^\infty(M) \to C^\infty(M)$ that satisfies (2.1). These concepts are related by (2.6) with $\delta_x = X_x$. We think of a vector field $X \in \mathfrak{X}(M)$ as the collection of all "tangent vectors" $X_x \in T_xM$, whereas we think of the corresponding derivation $\delta$ as a single global operation on $C^\infty(M)$.


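• Point derivations push forward under maps $\psi : M \to N$: for $x \in M$ we have linear maps

$$ T_x\psi \equiv \psi'_x : T_xM \to T_{\psi(x)}N; \qquad (\psi'_x \delta_x)(g) = \delta_x(\psi^* g) \quad (g \in C^\infty(N)), \tag{2.22} $$

where $\psi^* g := g \circ \psi$ is the pullback of $g$. Collecting these maps gives a vector bundle map

$$ T\psi \equiv \psi_* \equiv \psi' : TM \to TN. \tag{2.23} $$

• However, derivations (or vector fields) push forward only if $\psi : M \to N$ is a diffeomorphism: the map $\psi_* : \mathrm{Der}(C^\infty(M)) \to \mathrm{Der}(C^\infty(N))$, or $\psi_* : \mathfrak{X}(M) \to \mathfrak{X}(N)$, is given by

$$ \psi_*(\delta) = (\psi^{-1})^* \circ \delta \circ \psi^*. \tag{2.24} $$

One needs $(\psi^{-1})^*$ even if $N = M$, since $\delta \circ \psi^*$ fails to be a derivation of $C^\infty(M)$. Check!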
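The check can be sketched as follows. This is a worked verification rather than part of the original text; it uses only the Leibniz rule for $\delta$ and the fact that pullback is multiplicative, $\psi^*(fg) = (\psi^* f)(\psi^* g)$.

```latex
% For f, g in C^\infty(N):
\begin{align*}
(\delta \circ \psi^*)(fg)
  &= \delta\big((\psi^* f)(\psi^* g)\big)
   = \delta(\psi^* f)\,\psi^* g + \psi^* f\,\delta(\psi^* g), \\
\psi_*(\delta)(fg)
  &= (\psi^{-1})^*\big[\delta(\psi^* f)\,\psi^* g + \psi^* f\,\delta(\psi^* g)\big]
   = \psi_*(\delta)(f)\, g + f\, \psi_*(\delta)(g).
\end{align*}
```

In the first line the factors $\psi^* g$ and $\psi^* f$ appear where the Leibniz rule would require $g$ and $f$, so $\delta \circ \psi^*$ is not a derivation unless $\psi$ is the identity. In the second line, $(\psi^{-1})^* \psi^* = \mathrm{id}$ removes exactly these pullbacks.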


So far, tangent vectors $X_x \in T_xM$ were defined algebraically as point derivations, i.e. as linear maps $\delta_x : C^\infty(M) \to \mathbb{R}$ satisfying (2.3). Geometrically, each tangent vector (nomen est omen!) is tangent to some curve $\gamma$ through $x$, i.e., a map $\gamma : I \to M$, where $I \subset \mathbb{R}$ is some interval we always assume to contain 0, such that $\gamma(0) = x$ (see below for the existence of $\gamma$). In other words,

$$ X_x(f) = \frac{d}{dt} f(\gamma(t))\Big|_{t=0}, \tag{2.25} $$

which symbolically may be written as $X_x = \dot{\gamma} = d\gamma/dt$, or even as $X_x = d/dt$, with $\gamma$ understood.

This description gives a geometric perspective on the pushforward of $T_xM$ just described:

• If $X = d\gamma/dt$ is tangent to $\gamma$, then $\psi' X = d(\psi \circ \gamma)/dt$ is tangent to $\psi \circ \gamma$.


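In a chart $\varphi = (\varphi^1,\dots,\varphi^n) : U \to \mathbb{R}^n$, with $x \in U$, the components $X^i$ of $X_x$ are given by

$$ X^i = X_x(\varphi^i) = \frac{d}{dt} \varphi^i(\gamma(t))\Big|_{t=0} = \frac{d}{dt} \gamma^i(t)\Big|_{t=0}, \tag{2.26} $$

where $\gamma^i(t) = \varphi^i(\gamma(t))$. This also shows that $\gamma$ exists, given $X_x$, since it just has to satisfy (2.26).

Of course, $\gamma$ is far from unique. Eq. (2.26) gives the traditional transformation rule for vectors under a change of charts (i.e. of coordinates). If $x \in U_\alpha \cap U_\beta$, then (2.25) and (2.26) imply

$$ X^i_\beta = \sum_j \frac{\partial x^i_\beta}{\partial x^j_\alpha} X^j_\alpha, \tag{2.27} $$

where $X^i_\beta = X_x(\varphi^i_\beta)$, etc., and each coordinate $x^i_\beta = \varphi^i_\beta(x)$ of $x$ with respect to $\varphi_\beta$ is seen as a function of all coordinates $x^j_\alpha = \varphi^j_\alpha(x)$ of $x$ with respect to $\varphi_\alpha$, via the identity $\varphi_\beta = \varphi_\beta \circ \varphi_\alpha^{-1} \circ \varphi_\alpha$, i.e.

$$ x^i_\beta(x_\alpha) = \varphi^i_\beta \circ \varphi_\alpha^{-1}(x_\alpha). \tag{2.28} $$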
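As an illustration of (2.26) - (2.28), the following minimal sketch takes $M = \mathbb{R}^2 \setminus \{0\}$ with Cartesian coordinates as the chart $\varphi_\alpha$ and polar coordinates as $\varphi_\beta$. The choice of charts, the sample point, and all function names are assumptions made for the example. It computes the polar components of a tangent vector once through the Jacobian in (2.27) and once by differentiating a curve as in (2.26).

```python
import math

def cart_to_polar(x, y):
    """Transition map phi_beta o phi_alpha^{-1}: Cartesian (x, y) -> polar (r, theta)."""
    return math.hypot(x, y), math.atan2(y, x)

def jacobian(x, y):
    """Matrix (d x_beta^i / d x_alpha^j) of the transition map at (x, y)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [[x / r, y / r],
            [-y / r2, x / r2]]

def transform(components, x, y):
    """Eq. (2.27): X_beta^i = sum_j (d x_beta^i / d x_alpha^j) X_alpha^j."""
    J = jacobian(x, y)
    return [sum(J[i][j] * components[j] for j in range(2)) for i in range(2)]

def curve_velocity_polar(x, y, vx, vy, h=1e-6):
    """Eq. (2.26) in the polar chart, for the curve t -> (x + t vx, y + t vy),
    with d/dt at t = 0 approximated by a central difference."""
    r_plus, th_plus = cart_to_polar(x + h * vx, y + h * vy)
    r_minus, th_minus = cart_to_polar(x - h * vx, y - h * vy)
    return [(r_plus - r_minus) / (2 * h), (th_plus - th_minus) / (2 * h)]

point = (1.0, 1.0)
X_alpha = [0.0, 2.0]                      # components in the Cartesian chart
X_beta = transform(X_alpha, *point)       # components in the polar chart via (2.27)
X_beta_check = curve_velocity_polar(*point, *X_alpha)
```

Both routes describe the same tangent vector; only its components change with the chart.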
It is important to distinguish (2.27), which is a change of coordinates formula for a given tangent vector, from the pushforward of a tangent vector under a map $\psi : M \to M$. With $\varphi : U \to V \subset \mathbb{R}^n$, suppose for simplicity that $x \in U$ and also $\psi(x) \in U$. Then, writing $X^i$ for the components of $X_x$ as above, as well as $\psi^i = \varphi^i \circ \psi \circ \varphi^{-1}$ (which near $\varphi(x)$ is a function from $V$ to $\mathbb{R}$), we have

$$ (\psi'_x X)^i = \sum_j \frac{\partial \psi^i}{\partial x^j} X^j. \tag{2.29} $$

A curve $\gamma : I \to M$ integrates a vector field $X$ if $X_{\gamma(t)} = d\gamma(t)/dt$ for all $t \in I$, i.e., in coordinates,

$$ \frac{d\gamma^i(t)}{dt} = X^i(\gamma(t)). \tag{2.30} $$


The theory of ODEs shows that for each $x \in M$ there exists an open interval $I \subset \mathbb{R}$ (with $0 \in I$) and a curve $\gamma : I \to M$ on which (2.30) holds with $\gamma(0) = x$. This solution is unique in the sense that if $\gamma_1 : I_1 \to M$ and $\gamma_2 : I_2 \to M$ both satisfy (2.30) with $\gamma_1(0) = \gamma_2(0) = x_0$, then $\gamma_1 = \gamma_2$ on $I_1 \cap I_2$. Taking unions, it follows that there exists a maximal interval $I$ on which $\gamma$ is defined.

If for any $x \in M$ there is a curve $\gamma : \mathbb{R} \to M$ satisfying (2.30) with $\gamma(0) = x$, we say that $X \in \mathfrak{X}(M)$ is complete.^133 In that case, all integrating curves $\gamma$ can be assembled into the flow of $X$. This is a smooth map $\psi : \mathbb{R} \times M \to M$, written $\psi_t(x) = \psi(t,x)$, that satisfies

$$ \psi_0(x) = x; \tag{2.31} $$
$$ X_{\psi_t(x)} f = \frac{d}{dt} f(\psi_t(x)), \tag{2.32} $$

for all $x \in M$, $t \in \mathbb{R}$, and $f \in C^\infty(M)$. Thus the flow $\psi$ of $X$ gives "the" integral curve $\gamma$ of $X$ through $x_0$ by $\gamma(t) = \psi_t(x_0)$. Any complete vector field has a unique flow. Uniqueness implies that $M$ is a disjoint union of the integral curves of $X$ (which can never cross each other because of the uniqueness of the solution), and also implies the composition rule

$$ \psi_s \circ \psi_t = \psi_{s+t}. \tag{2.33} $$

From a group-theoretic point of view, a flow is therefore an action of $\mathbb{R}$ (as an additive group) on $M$ that in addition integrates $X$ in the sense of (2.32). In particular, (2.33) implies $\psi_{-t} = \psi_t^{-1}$, so that each $\psi_t : M \to M$ is automatically a diffeomorphism of $M$.
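To make the flow properties concrete, here is a small sketch for one complete vector field, chosen purely for illustration: the rotation field $X = -y\,\partial_x + x\,\partial_y$ on $\mathbb{R}^2$, whose flow is rotation by the angle $t$. The function names and sample values are hypothetical. The code checks (2.31), the group law (2.33), and the integral curve equation (2.30) numerically.

```python
import math

def rotation_flow(t, p):
    """Flow psi_t of X = -y d/dx + x d/dy on R^2: rotation by the angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def X_rot(p):
    """Components of X at the point p."""
    x, y = p
    return (-y, x)

def close(p, q, tol=1e-9):
    """Componentwise comparison up to a tolerance."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

p0 = (1.0, 0.5)
s, t = 0.7, -1.3

# (2.31): psi_0 is the identity.
identity_ok = close(rotation_flow(0.0, p0), p0)

# (2.33): psi_s o psi_t = psi_{s+t}.
group_ok = close(rotation_flow(s, rotation_flow(t, p0)), rotation_flow(s + t, p0))

# Consequence of (2.33): psi_{-t} inverts psi_t.
inverse_ok = close(rotation_flow(-t, rotation_flow(t, p0)), p0)

# (2.30): d/dt psi_t(p0) = X(psi_t(p0)), checked by a central difference.
h = 1e-6
q_plus, q_minus = rotation_flow(t + h, p0), rotation_flow(t - h, p0)
velocity = tuple((a - b) / (2 * h) for a, b in zip(q_plus, q_minus))
ode_ok = close(velocity, X_rot(rotation_flow(t, p0)), tol=1e-6)
```

Each orbit of this flow is a circle around the origin (or the origin itself), which illustrates the decomposition of $M$ into disjoint integral curves.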

If $X$ is not complete (a case that will be of great interest to GR!), we first define the domain $D_X \subset \mathbb{R} \times M$ of $\psi$ as the set of all $(t,x) \in \mathbb{R} \times M$ for which there exists an open interval $I \subset \mathbb{R}$ containing 0 and $t$, as well as a (necessarily unique) curve $\gamma : I \to M$ that satisfies (2.30) with initial condition $\gamma(0) = x$. Obviously $\{0\} \times M \subset D_X$, and (less trivially) it turns out that $D_X$ is open. Then a flow of $X$ is a map $\psi : D_X \to M$ that satisfies (2.31) for all $x$ and (2.32) for all $(t,x) \in D_X$. Eq. (2.33) then holds if the left-hand side (and hence the right-hand side) is defined.
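A standard example of an incomplete vector field, not taken from the text, is $X = x^2\,\partial_x$ on $M = \mathbb{R}$. Solving $\dot{\gamma} = \gamma^2$ with $\gamma(0) = x$ gives $\psi_t(x) = x/(1 - tx)$, which blows up at $t = 1/x$, so that $D_X = \{(t,x) \mid tx < 1\}$, an open subset of $\mathbb{R} \times M$. The sketch below, with hypothetical function names, encodes this flow together with its domain.

```python
def in_domain(t, x):
    """(t, x) lies in D_X for X = x^2 d/dx on R iff t * x < 1."""
    return t * x < 1.0

def flow(t, x):
    """psi_t(x) = x / (1 - t x), the solution of dgamma/dt = gamma^2 with gamma(0) = x."""
    if not in_domain(t, x):
        raise ValueError("(t, x) lies outside the flow domain D_X")
    return x / (1.0 - t * x)

x0 = 2.0                      # the integral curve through x0 blows up as t -> 1/x0 = 0.5
inside = in_domain(0.4, x0)
outside = in_domain(0.6, x0)

# (2.33) holds wherever the left-hand side is defined.
s, t = 0.1, 0.2
lhs = flow(s, flow(t, x0))
rhs = flow(s + t, x0)
```

The integral curve through $x_0 = 2$ exists only for $t < 1/2$, so no flow defined on all of $\mathbb{R} \times M$ can exist for this field.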

As a first application of flows, let us define the Lie derivative $\mathcal{L}_X Y$ of some vector field $Y \in \mathfrak{X}(M)$ with respect to another vector field $X \in \mathfrak{X}(M)$ by

$$ \mathcal{L}_X Y(x) = \lim_{t \to 0} \frac{Y_{\psi_t(x)} - \psi'_t(Y_x)}{t} = \lim_{t \to 0} \frac{\psi'_{-t}(Y_{\psi_t(x)}) - Y_x}{t}, \tag{2.34} $$

where $\psi$ is the flow of $X$. Note that $Y_{\psi_t(x)} - Y_x$ would be undefined, since $Y_{\psi_t(x)} \in T_{\psi_t(x)}M$ whilst $Y_x \in T_xM$ and these are different vector spaces; the pushforward $\psi'_t$ serves to move $Y_x$ to $T_{\psi_t(x)}M$. A simple computation then yields the extremely useful result

$$ \mathcal{L}_X Y = [X,Y]. \tag{2.35} $$

133 If $X$ has compact support, then it is complete. So if $M$ is compact, then every vector field on $M$ is complete.
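The identity $\mathcal{L}_X Y = [X,Y]$ can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions: it takes $X = -y\,\partial_x + x\,\partial_y$ on $\mathbb{R}^2$ (whose flow is a rotation, so that its pushforward is the same rotation matrix) and $Y = x\,\partial_x$, and it uses the standard coordinate expression $[X,Y]^i = X^j \partial_j Y^i - Y^j \partial_j X^i$ for the bracket. The limit defining the Lie derivative is replaced by a symmetric difference quotient.

```python
import math

def rot(t, v):
    """Rotation by the angle t: the flow psi_t of X = -y d/dx + x d/dy and,
    since psi_t is linear, also its pushforward psi_t'."""
    x, y = v
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def Y(p):
    """The vector field Y = x d/dx, given by its components at the point p."""
    x, y = p
    return (x, 0.0)

def lie_derivative_flow(p, t=1e-5):
    """Flow definition of L_X Y at p, with the limit t -> 0 replaced by a
    symmetric difference quotient of t -> psi_{-t}'(Y at psi_t(p))."""
    forward = rot(-t, Y(rot(t, p)))
    backward = rot(t, Y(rot(-t, p)))
    return tuple((a - b) / (2 * t) for a, b in zip(forward, backward))

def bracket(p):
    """[X, Y] for X = (-y, x), Y = (x, 0), computed by hand:
    X^j d_j Y = (-y, 0) and Y^j d_j X = (0, x), hence [X, Y] = (-y, -x)."""
    x, y = p
    return (-y, -x)

p = (0.3, -1.2)
numeric = lie_derivative_flow(p)
exact = bracket(p)
```

The two results agree up to the discretization error of the difference quotient.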
2.3 Dual vector spaces, metrics, and tensor products


In order to define tensors we need some linear algebra. Let $V$ be a finite-dimensional real vector space, with $\dim(V) = n$, which in GR will be $V = T_xM$. The dual $V^* = \mathrm{Hom}(V,\mathbb{R})$ consists of all linear maps from $V$ to $\mathbb{R}$. This is a real vector space in its own right under pointwise constructions. It is isomorphic to $V$ (as a vector space), but not canonically so: one needs to specify a basis $(e_1,\dots,e_n)$ of $V$, with corresponding dual basis $(\omega^1,\dots,\omega^n)$ defined by $\omega^a(e_b) = \delta^a_b$, upon which the ugly map $\sum_a v^a e_a \mapsto \sum_a v^a \omega^a$ from $V$ to $V^*$ is an isomorphism (which obviously depends on the chosen basis). However, we do have a canonical isomorphism

$$ V \to V^{**}; \qquad v \mapsto \hat{v}; \qquad \hat{v}(\theta) = \theta(v), \tag{2.36} $$

where $\hat{v} \in V^{**} = \mathrm{Hom}(V^*,\mathbb{R})$. This map is injective for any $V$, but it is surjective (and hence an isomorphism) iff $V$ is finite-dimensional. One often writes $\langle \theta, v \rangle$ for both $\theta(v)$ and $\hat{v}(\theta)$. The naturality of the isomorphism $V^* \cong V$ improves markedly in the presence of a metric.


Definition 2.5 A metric $g$ on $V$ is a bilinear map $g : V \times V \to \mathbb{R}$ that is:
• symmetric, in that $g(v,w) = g(w,v)$ for all $v, w \in V$;
• nondegenerate, i.e. for each nonzero vector $v \in V$ there is $w \in V$ such that $g(v,w) \neq 0$.

A metric $g$ yields two maps that are mutually inverse and hence are isomorphisms $V^* \cong V$:

$$ \flat : V \to V^*, \qquad \flat(v) = v^\flat; \qquad v^\flat(w) := g(v,w); \tag{2.37} $$
$$ \sharp : V^* \to V, \qquad \sharp(\theta) = \theta^\sharp; \qquad g(\theta^\sharp, v) := \theta(v). \tag{2.38} $$

Any metric $g$ can be diagonalized, i.e. $V$ has an orthonormal basis $(e_a) = (e_1,\dots,e_n)$, in which

$$ g(e_a, e_b) = \varepsilon_a \delta_{ab}; \qquad \varepsilon_a = \pm 1. \tag{2.39} $$

The pair $(n_-, n_+)$, where $n_-$/$n_+$ is the number of negative/positive numbers $\varepsilon_a$, is independent of the basis and hence is an intrinsic property of a metric $g$, called its signature. Especially in relativity, the signature is often written as $(- \cdots - + \cdots +)$, with $n_-$/$n_+$ minus/plus signs.
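The musical isomorphisms and the signature are easy to compute in an orthonormal basis. The following minimal sketch assumes a metric with $\varepsilon = (-1,1,1,1)$, the case relevant to GR; the function names and sample vectors are illustrative only. In such a basis both $\flat$ and $\sharp$ act on components by multiplication with $\varepsilon_a$, since the matrix $\mathrm{diag}(\varepsilon_a)$ is its own inverse.

```python
# Components of a metric in an orthonormal basis; here (n_-, n_+) = (1, 3).
eps = [-1.0, 1.0, 1.0, 1.0]

def g(v, w):
    """g(v, w) = sum_a eps_a v^a w^a in the orthonormal basis."""
    return sum(e * a * b for e, a, b in zip(eps, v, w))

def flat(v):
    """Components of v^flat in the dual basis: (v^flat)_a = eps_a v^a."""
    return [e * a for e, a in zip(eps, v)]

def sharp(theta):
    """Components of theta^sharp: (theta^sharp)^a = eps_a theta_a,
    because diag(eps) is its own inverse when eps_a = +1 or -1."""
    return [e * a for e, a in zip(eps, theta)]

def pair(theta, v):
    """Evaluation theta(v) in dual bases."""
    return sum(a * b for a, b in zip(theta, v))

v = [2.0, 1.0, 0.0, -3.0]
w = [1.0, 4.0, 5.0, 0.5]

signature = (sum(1 for e in eps if e < 0), sum(1 for e in eps if e > 0))
```

Here `pair(flat(v), w)` reproduces `g(v, w)`, and `sharp(flat(v))` returns `v`, which is the statement that the two maps are mutually inverse.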

We now turn to the tensor product. In the following proposition, $V$ and $W$ are real but not necessarily finite-dimensional (and the same construction works over any field, typically $\mathbb{C}$).

Proposition 2.6 Let $V$ and $W$ be real vector spaces. There is a real vector space called $V \otimes W$, in words the tensor product of $V$ and $W$ (over $\mathbb{R}$), and a bilinear map

$$ p : V \times W \to V \otimes W; \qquad p(v,w) = v \otimes w, \tag{2.40} $$

such that for any vector space $X$ and any bilinear map $\beta : V \times W \to X$, there is a unique linear map $\beta' : V \otimes W \to X$ such that $\beta = \beta' \circ p$. In other words, the following diagram commutes:

$$ \begin{array}{ccc} V \times W & \xrightarrow{\;\;p\;\;} & V \otimes W \\ & {\scriptstyle \beta}\searrow & \downarrow {\scriptstyle \beta'} \\ & & X \end{array} \tag{2.41} $$

Moreover, this so-called universal property implies that $V \otimes W$ is unique up to isomorphism.
We will not prove this here in general, but do show existence of $V \otimes W$ if $V$ and $W$ are finite-dimensional.^134 We first assume that $V = Y^*$ and $W = Z^*$, in which case we define

$$ Y^* \otimes Z^* := \mathrm{Hom}(Y \times Z, \mathbb{R}); \tag{2.42} $$
$$ (\sigma \otimes \rho)(y,z) := \sigma(y)\rho(z), \tag{2.43} $$

where $\mathrm{Hom}(Y \times Z, \mathbb{R})$ is the space of bilinear maps from $Y \times Z$ to $\mathbb{R}$, and of course $\sigma \in Y^*$, $\rho \in Z^*$, $y \in Y$, $z \in Z$. Then $\beta'(\sigma \otimes \rho) = \beta(\sigma,\rho)$ by construction, and this uniquely extends to a linear map $\beta' : \mathrm{Hom}(Y \times Z, \mathbb{R}) \to X$, since $\mathrm{Hom}(Y \times Z, \mathbb{R})$ is the linear span of all $\sigma \otimes \rho$.

This also covers $V$ and $W$ themselves, at least up to isomorphism, since in finite dimension we have the isomorphism (2.36), so that, identifying $V$ with $V^{**}$ etc., we obtain

$$ V \otimes W \cong V^{**} \otimes W^{**} = \mathrm{Hom}(V^* \times W^*, \mathbb{R}); \tag{2.44} $$
$$ (v \otimes w)(\theta, \tau) = \theta(v)\tau(w), \tag{2.45} $$

where this time $v \in V$, $w \in W$, $\theta \in V^*$, and $\tau \in W^*$. Once again, $\beta' : \mathrm{Hom}(V^* \times W^*, \mathbb{R}) \to X$ is uniquely defined by linear extension of $\beta'(v \otimes w) = \beta(v,w)$, since the linear span of all $v \otimes w$ equals $\mathrm{Hom}(V^* \times W^*, \mathbb{R})$. We have effectively identified $v$ with $\hat{v}$ and $w$ with $\hat{w}$, cf. (2.36), and this shows up: although (2.42) gives $Y^* \otimes \mathbb{R} \cong Y^*$ as expected, eq. (2.44) has the consequence

$$ V \otimes \mathbb{R} = \mathrm{Hom}(V^*, \mathbb{R}) = V^{**}, \tag{2.46} $$

where one would prefer to see $V$. But although no one would criticize the realization $V \otimes \mathbb{R} = V$, eq. (2.46) reconfirms that tensor products are merely defined up to isomorphism, cf. (2.36). Similarly, instead of $V^* \otimes V = \mathrm{Hom}(V \times V^*, \mathbb{R})$, as suggested by (2.42) and (2.44), we may take

$$ V^* \otimes V = \mathrm{Hom}(V,V), \tag{2.47} $$

since one has an isomorphism $\mathrm{Hom}(V \times V^*, \mathbb{R}) \to \mathrm{Hom}(V,V)$, given by linear extension of

$$ \theta \otimes w \mapsto (v \mapsto \theta(v)w). \tag{2.48} $$

With $v \in V$ and $\theta \in V^*$ as before, the inverse of the map (2.48) is given by

$$ \mathrm{Hom}(V,V) \to \mathrm{Hom}(V \times V^*, \mathbb{R}); \qquad \varphi \mapsto \hat{\varphi}; \qquad \hat{\varphi}(v,\theta) = \theta(\varphi(v)). \tag{2.49} $$
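In components, the identification $V^* \otimes V = \mathrm{Hom}(V,V)$ is the familiar statement that an elementary tensor $\theta \otimes w$ is the rank-one matrix with entries $w^i \theta_j$. The sketch below, with hypothetical names and sample vectors in $V = \mathbb{R}^3$, implements (2.48) and (2.49) and checks that $\hat{\varphi}(v,\eta) = \theta(v)\,\eta(w)$ for $\varphi = \theta \otimes w$, in agreement with the bilinear form obtained from (2.43) and (2.45).

```python
def elementary(theta, w):
    """(2.48): the linear map v -> theta(v) w, as a matrix with entries w^i theta_j."""
    return [[wi * tj for tj in theta] for wi in w]

def apply(matrix, v):
    """Apply a linear map V -> V, given as a matrix, to a vector v."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]

def hat(matrix):
    """(2.49): the bilinear form (v, theta) -> theta(phi(v)) on V x V^*."""
    return lambda v, theta: sum(t * c for t, c in zip(theta, apply(matrix, v)))

theta = [1.0, -2.0, 0.0]   # covector, components theta_j
w = [3.0, 0.0, 1.0]        # vector, components w^i
v = [2.0, 0.5, 5.0]        # test vector
eta = [0.0, 4.0, -1.0]     # test covector

phi = elementary(theta, w)
lhs = hat(phi)(v, eta)     # eta(phi(v))
rhs = (sum(t * a for t, a in zip(theta, v))
       * sum(e * b for e, b in zip(eta, w)))   # theta(v) * eta(w)
```

A general linear map is a finite sum of such rank-one maps, which is the content of "linear extension" in (2.48).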
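In connection with the Riemann tensor we will have occasion to use the induced isomorphism

$$ V^* \otimes V^* \otimes W^* \otimes W \cong \mathrm{Hom}(V \times V, \mathrm{Hom}(W,W)); \tag{2.50} $$
$$ \theta_1 \otimes \theta_2 \otimes \eta \otimes w_1 \mapsto \big( (v_1,v_2) \mapsto (w_2 \mapsto \theta_1(v_1)\theta_2(v_2)\eta(w_2)w_1) \big). \tag{2.51} $$

To describe the inverse of this map we combine (2.42) and (2.44) to pick the realization

$$ V^* \otimes V^* \otimes W^* \otimes W = \mathrm{Hom}(V \times V \times W \times W^*, \mathbb{R}). \tag{2.52} $$

The image $\hat{\varphi} \in \mathrm{Hom}(V \times V \times W \times W^*, \mathbb{R})$ of $\varphi \in \mathrm{Hom}(V \times V, \mathrm{Hom}(W,W))$ is then given by

$$ \hat{\varphi}(v_1, v_2, w, \eta) = \eta(\varphi(v_1,v_2)(w)). \tag{2.53} $$

134 The construction applies in general if we define $V \otimes W$ as the finite linear span of all $v \otimes w$ in $\mathrm{Hom}(V^* \times W^*, \mathbb{R})$.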
2.4 Cotangent bundle


Now that we have the tangent bundle $TM$ and the constructions in §2.3, all vector bundles on $M$ that are relevant for GR follow. First, the cotangent bundle $T^*M$ is defined as

$$ T^*M := \bigsqcup_{x \in M} T^*_xM; \qquad T^*_xM \equiv (T_xM)^* := \mathrm{Hom}(T_xM, \mathbb{R}), \tag{2.54} $$

i.e. $T^*_xM$ is the dual of the vector space $T_xM$, consisting of all linear maps $\theta_x : T_xM \to \mathbb{R}$. The smooth structure of $T^*M$ is the unique one such that elements $\theta \in \Gamma(T^*M) \equiv \Omega^1(M) \equiv \Omega(M)$, called covectors (or 1-forms), consist of those maps $x \mapsto \theta_x$ for which the function $x \mapsto \theta_x(X_x)$ from $M$ to $\mathbb{R}$ is smooth for each vector field $X \in \mathfrak{X}(M)$. Since $T_xM \cong \mathbb{R}^n$ we also have $T^*_xM \cong \mathbb{R}^n$, so that, like the tangent bundle $TM$, also the cotangent bundle $T^*M$ is an $n$-dimensional vector bundle over $M$. In a coordinate system $(x^i)$ defined by some chart, $T^*_xM$ has basis $(dx^1,\dots,dx^n)$ defined by $dx^i(\partial_j) = \delta^i_j$, which is dual to the basis $(\partial_1,\dots,\partial_n)$ of $T_xM$ defined in (2.20). Thus

$$ \theta = \sum_i \theta_i\, dx^i; \qquad \theta_i = \theta(\partial_i). \tag{2.55} $$


For an equivalent view of $dx^i$, one may define the exterior derivative $d : C^\infty(M) \to \Omega(M)$ by

$$ df(X) := X(f). \tag{2.56} $$

Then $dx^i$ coincides with $d\varphi^i$, where $x^i = \varphi^i(x)$ as usual, and in coordinates (2.56) simply reads

$$ df = \sum_i \frac{\partial f}{\partial x^i}\, dx^i. \tag{2.57} $$

More generally, let $(e_a)$ be a basis of $T_xM$, with dual basis $(\omega^a)$ of $T^*_xM$ (i.e. $\omega^a(e_b) = \delta^a_b$). Once again, if we expand $\theta = \sum_a \theta_a \omega^a$, we have $\theta_a = \theta(e_a)$. This may be done at a single point, but bases like $(\partial_1,\dots,\partial_n)$ and $(dx^1,\dots,dx^n)$ are defined at each $x \in U$ on which the coordinates $x^i = \varphi^i(x)$ are defined. Similarly, some basis $(e_a)$ may be defined at each $x \in U$, where $U \in \mathcal{O}(M)$ need not even be the domain of a chart. In that case $(e_a)$ is called a (moving) frame or an $n$-bein (so that in GR one has a vierbein or tetrad). Abstractly, if $E \to M$ is a $k$-dimensional vector bundle, one may locally find $k$ linearly independent cross-sections $(u_1,\dots,u_k)$ of $E$ and expand any $s \in \Gamma(E)$ by $s(x) = \sum_j s_j(x) u_j(x)$, where $s_j \in C^\infty(M)$ and $u_j \in \Gamma(E)$.

Whereas tangent vectors push forward from $M$ to $N$ under maps $\psi : M \to N$, covectors pull back from $N$ to $M$, like functions: besides the pull-back $\psi^* : C^\infty(N) \to C^\infty(M)$ on functions, any (smooth) map $\psi$ induces a pullback $\psi^* : \Omega(N) \to \Omega(M)$ on 1-forms by

$$ (\psi^*\theta)_x(X_x) = \theta_{\psi(x)}(\psi'_x X_x), \tag{2.58} $$

where $\theta \in \Omega(N)$ and $X_x \in T_xM$. For any $f \in C^\infty(N)$ with $df \in \Omega(N)$, this yields

$$ \psi^*(df) = d(\psi^* f). \tag{2.59} $$
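For completeness, the compatibility of pullback and exterior derivative can be verified in one line from the definitions. The following is a worked derivation using only (2.22), (2.56), and (2.58); it is supplied here as a sketch of the omitted step.

```latex
% For X_x in T_xM and f in C^\infty(N):
\begin{align*}
\big(\psi^*(df)\big)_x(X_x)
  &= (df)_{\psi(x)}(\psi'_x X_x)  && \text{by (2.58)} \\
  &= (\psi'_x X_x)(f)             && \text{by (2.56)} \\
  &= X_x(\psi^* f)                && \text{by (2.22)} \\
  &= \big(d(\psi^* f)\big)_x(X_x) && \text{by (2.56)}.
\end{align*}
```

Since $x \in M$ and $X_x \in T_xM$ are arbitrary, the two 1-forms on $M$ coincide.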


A decent vector bundle map $\psi^* : T^*N \to T^*M$ is defined only if $\psi$ is a diffeomorphism: for $\theta_y \in T^*_yN$ ($y \in N$), we need $x = \psi^{-1}(y) \in M$, so that the pullback $\psi^*_x(\theta_y) \in T^*_xM$ is defined by

$$ (\psi^*_x \theta_y)(X_x) = \theta_y(\psi'_x X_x). \tag{2.60} $$

If $\psi$ is merely injective, then we still obtain a map $\psi^* : T^*(\psi(M)) \to T^*M$ in this way.
2.5 Tensor bundles


For $(k,l) \in \mathbb{N} \times \mathbb{N}$ we define a vector bundle $T^{(k,l)}M$ over $M$ in the usual way via its fibers

$$ T^{(k,l)}_xM := \mathrm{Hom}((T_xM)^k \times (T^*_xM)^l, \mathbb{R}), \tag{2.61} $$

i.e. the vector space of $(k+l)$-fold multilinear maps from $(T_xM)^k \times (T^*_xM)^l$ to $\mathbb{R}$. These fibers comprise the total space of the bundle as a disjoint union

$$ T^{(k,l)}M := \bigsqcup_{x \in M} T^{(k,l)}_xM, \tag{2.62} $$

whose manifold structure will be defined below (by defining the smooth sections). We then have

$$ T^{(0,0)}M = M \times \mathbb{R}; \qquad T^{(1,0)}M = T^*M; \qquad T^{(0,1)}M = TM, \tag{2.63} $$

where in the last entry we used (2.36). Repeatedly using Proposition 2.6, taking (2.44) as a realization of "the" tensor product, and once again using (2.36), we obtain the realization

$$ T^{(k,l)}_xM = (\otimes^k T^*_xM) \otimes (\otimes^l T_xM), \tag{2.64} $$

where $\otimes^l V$ is the $l$ times iterated tensor product of $V$ with itself. According to (2.64), the fiber $T^{(k,l)}_xM$ consists of finite sums of elementary tensors $\alpha_1 \otimes \cdots \otimes \alpha_k \otimes v_1 \otimes \cdots \otimes v_l$, defined for $\alpha_i \in T^*_xM$ ($i = 1,\dots,k$) and $v_j \in T_xM$ ($j = 1,\dots,l$). In terms of (2.61), one has

$$ \alpha_1 \otimes \cdots \otimes \alpha_k \otimes v_1 \otimes \cdots \otimes v_l\,(X_1,\dots,X_k; \theta^1,\dots,\theta^l) = \alpha_1(X_1) \cdots \alpha_k(X_k)\, v_1(\theta^1) \cdots v_l(\theta^l), $$

where each $X_i \in T_xM$ and each $\theta^j \in T^*_xM$. We then define $\Gamma(T^{(k,l)}M)$ as the set of all cross-sections $x \mapsto \tau_x$ from $M$ to $T^{(k,l)}M$ (i.e. maps such that $\tau_x \in T^{(k,l)}_xM$) for which the map

$$ x \mapsto \tau_x(X_1(x),\dots,X_k(x); \theta^1(x),\dots,\theta^l(x)) $$

from $M$ to $\mathbb{R}$ is smooth for each $(X_1,\dots,X_k; \theta^1,\dots,\theta^l)$ with $X_i \in \mathfrak{X}(M)$ and $\theta^j \in \Omega(M)$. This equips the vector bundles $T^{(k,l)}M$ with a manifold structure, in that we declare $\Gamma(T^{(k,l)}M)$ to be the space of smooth cross-sections of $T^{(k,l)}M$. Elements of $\Gamma(T^{(k,l)}M)$ are called tensors (or tensor fields, if $\tau_x$ is regarded as a tensor). In GR, $T^{(2,0)}M$ and $T^{(3,1)}M$ will be very important.

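All this can be rewritten in terms of indices. In terms of the (coordinate) basis $(\partial_1,\dots,\partial_n)$ of $T_xM$ with dual basis $(dx^1,\dots,dx^n)$ of $T^*_xM$, the fiber $T^{(k,l)}_xM$ then has a basis

$$ (dx^{i_1} \otimes \cdots \otimes dx^{i_k} \otimes \partial_{j_1} \otimes \cdots \otimes \partial_{j_l}), \tag{2.65} $$

where all indices run from 1 to $n$. Thus $T^{(k,l)}M$ is an $n^{k+l}$-dimensional vector bundle. Like vectors, tensors at $x$ may be specified by their components with respect to some basis of $T_xM$ and associated dual basis of $T^*_xM$. In the usual coordinate basis $(\partial_i)$ we have

$$ \tau_x = \tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x)\, dx^{i_1} \otimes \cdots \otimes dx^{i_k} \otimes \partial_{j_1} \otimes \cdots \otimes \partial_{j_l}; \tag{2.66} $$
$$ \tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x) = \tau_x(\partial_{i_1},\dots,\partial_{i_k}; dx^{j_1},\dots,dx^{j_l}), \tag{2.67} $$

where we use the Einstein summation convention: repeated indices are summed over. That is, the right-hand side of (2.66) should really be preceded by $\sum_{i_1,\dots,i_k,j_1,\dots,j_l=1}^n$. Similarly, in an arbitrary basis $(e_a)$ of $T_xM$ with dual basis $(\omega^a)$ of $T^*_xM$ one has

$$ \tau_x = \tau^{b_1 \cdots b_l}_{a_1 \cdots a_k}(x)\, \omega^{a_1} \otimes \cdots \otimes \omega^{a_k} \otimes e_{b_1} \otimes \cdots \otimes e_{b_l}; \tag{2.68} $$
$$ \tau^{b_1 \cdots b_l}_{a_1 \cdots a_k}(x) = \tau_x(e_{a_1},\dots,e_{a_k}; \omega^{b_1},\dots,\omega^{b_l}). \tag{2.69} $$

We write $\mathfrak{X}^{(k,l)}(M)$ for the space of cross-sections $\Gamma(T^{(k,l)}M)$ of $T^{(k,l)}M$, so that

$$ \mathfrak{X}^{(0,0)}(M) = C^\infty(M); \qquad \mathfrak{X}^{(0,1)}(M) = \mathfrak{X}(M); \qquad \mathfrak{X}^{(1,0)}(M) = \Omega(M). \tag{2.70} $$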


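A tensor $\tau \in \mathfrak{X}^{(k,l)}(M)$ of type $(k,l)$ maps $k$ vector fields $(X_1,\dots,X_k)$ and $l$ covector fields $(\theta^1,\dots,\theta^l)$ to a smooth function on $M$ by pointwise evaluation, i.e.

$$ \tau : \mathfrak{X}(M)^k \times \Omega(M)^l \to C^\infty(M); \tag{2.71} $$
$$ \tau(X_1,\dots,X_k; \theta^1,\dots,\theta^l) : x \mapsto \tau_x(X_1(x),\dots,X_k(x); \theta^1(x),\dots,\theta^l(x)). \tag{2.72} $$

This map is evidently $(k+l)$-multilinear over $C^\infty(M)$, in the sense that e.g.

$$ \tau(f_1X_1,\dots,f_kX_k; g_1\theta^1,\dots,g_l\theta^l) = f_1 \cdots f_k\, g_1 \cdots g_l\, \tau(X_1,\dots,X_k; \theta^1,\dots,\theta^l), \tag{2.73} $$

for all $f_i, g_j \in C^\infty(M)$; here we use the fact that $\mathfrak{X}(M)$ and $\Omega(M)$ are $C^\infty(M)$ modules.

Proposition 2.7 (tensoriality test) A map

$$ \tau : \mathfrak{X}(M)^k \times \Omega(M)^l \to C^\infty(M) \tag{2.74} $$

is given by a tensor

$$ \tau \in \mathfrak{X}^{(k,l)}(M) \tag{2.75} $$

through (2.72) iff $\tau$ satisfies (2.73), i.e., iff it is $C^\infty(M)$-multilinear in all entries.

Proof. The proof is easy in local coordinates, where (2.73) yields

$$ \begin{aligned} \tau(X_1,\dots,X_k; \theta^1,\dots,\theta^l) &= \tau(X_1^{i_1}\partial_{i_1},\dots,X_k^{i_k}\partial_{i_k}; \theta^1_{j_1}dx^{j_1},\dots,\theta^l_{j_l}dx^{j_l}) \\ &= X_1^{i_1} \cdots X_k^{i_k}\, \theta^1_{j_1} \cdots \theta^l_{j_l}\, \tau(\partial_{i_1},\dots,\partial_{i_k}; dx^{j_1},\dots,dx^{j_l}), \end{aligned} \tag{2.76} $$

so if we define the components $\tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x)$ of $\tau_x$ by (2.67) and subsequently define $\tau_x$ itself by (2.66), we have found the desired tensor that via (2.72) reproduces the given map $\tau$.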
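A standard example of a map that fails the tensoriality test may help to see what the proposition excludes. It is added here as an illustration and assumes only the bracket $[X,Y] = XY - YX$ of derivations. Consider $(X,Y,\theta) \mapsto \theta([X,Y])$, which has the right domain and codomain for a tensor with $k = 2$ and $l = 1$:

```latex
% For f, g in C^\infty(M) and X, Y in \mathfrak{X}(M):
\begin{align*}
[fX, Y]\,g &= fX(Yg) - Y(f\,Xg) = f\,[X,Y]\,g - (Yf)\,Xg, \\
\theta([fX, Y]) &= f\,\theta([X,Y]) - (Yf)\,\theta(X).
\end{align*}
```

The extra term $-(Yf)\,\theta(X)$ shows that this map is $\mathbb{R}$-multilinear but not $C^\infty(M)$-multilinear in its first entry, so it is not given by a tensor: its value at $x$ depends on derivatives of $X$ and $Y$ at $x$, not only on $X_x$ and $Y_x$.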


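Eqs. (2.66) - (2.67) imply the transformation properties of tensors under changes of coordinates (i.e. charts), which historically even defined tensors: in the situation of (2.27),

$$ (\tau_\beta)^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x_\beta) = \frac{\partial x^{i'_1}_\alpha}{\partial x^{i_1}_\beta} \cdots \frac{\partial x^{i'_k}_\alpha}{\partial x^{i_k}_\beta}\, \frac{\partial x^{j_1}_\beta}{\partial x^{j'_1}_\alpha} \cdots \frac{\partial x^{j_l}_\beta}{\partial x^{j'_l}_\alpha}\, (\tau_\alpha)^{j'_1 \cdots j'_l}_{i'_1 \cdots i'_k}(x_\alpha), \tag{2.77} $$

where the "new" coordinates $(x_\beta) = (x^1_\beta,\dots,x^n_\beta)$ are functions of the "old" coordinates $(x_\alpha) = (x^1_\alpha,\dots,x^n_\alpha)$, cf. (2.28), and hence the matrix $(\partial x^{i'}_\alpha / \partial x^i_\beta)$ is defined as the inverse of the matrix $(\partial x^i_\beta / \partial x^{i'}_\alpha)$, both seen as functions of the $(x_\alpha)$. Note that the argument $x_\beta$ in (2.77) refers to the same point $x \in M$ as the argument $x_\alpha$ (but in different coordinates). Conversely, from a "tensor" (in the original historical sense of the word) $\tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x)$ we obtain a map $\tau$ of the kind (2.71) by

$$ \tau(X_1,\dots,X_k; \theta^1,\dots,\theta^l)(x) := \tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x)\, X_1^{i_1}(x) \cdots X_k^{i_k}(x)\, \theta^1_{j_1}(x) \cdots \theta^l_{j_l}(x). \tag{2.78} $$

It then follows from (2.77) that $\tau$ is well defined in being coordinate-independent. It is also $(k+l)$-multilinear over $C^\infty(M)$ by construction, so that we recover (2.71) - (2.73).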
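As a concrete instance of the transformation rule for a tensor of type $(2,0)$, the sketch below transforms the Euclidean metric on $\mathbb{R}^2$ from Cartesian coordinates (playing the role of $x_\alpha$) to polar coordinates (playing the role of $x_\beta$). The example, the names, and the sample values are assumptions made for illustration. It also checks the coordinate independence of the resulting map: the number $g(X,X)$ is the same in both charts when the components of $X$ are transformed as vector components.

```python
import math

def inverse_jacobian(r, theta):
    """A[a][i] = d x_alpha^a / d x_beta^i for x_alpha = (x, y) and x_beta = (r, theta),
    using x = r cos(theta), y = r sin(theta)."""
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta),  r * math.cos(theta)]]

def transform_20(tau_alpha, A):
    """Type (2,0) rule: (tau_beta)_{ij} = A[a][i] A[b][j] (tau_alpha)_{ab}, summed over a, b."""
    n = len(A)
    return [[sum(A[a][i] * A[b][j] * tau_alpha[a][b]
                 for a in range(n) for b in range(n))
             for j in range(n)] for i in range(n)]

def invert_2x2(A):
    """Inverse of a 2x2 matrix, i.e. the matrix (d x_beta^i / d x_alpha^a)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def quadratic(g, X):
    """g(X, X) = g_{ij} X^i X^j."""
    n = len(X)
    return sum(g[i][j] * X[i] * X[j] for i in range(n) for j in range(n))

delta = [[1.0, 0.0], [0.0, 1.0]]   # Euclidean metric in Cartesian coordinates
r, theta = 2.0, 0.6
A = inverse_jacobian(r, theta)
g_polar = transform_20(delta, A)   # expected: diag(1, r^2)

X_cart = [0.5, -1.5]
J = invert_2x2(A)
X_polar = [sum(J[i][a] * X_cart[a] for a in range(2)) for i in range(2)]
norm_cart = quadratic(delta, X_cart)
norm_polar = quadratic(g_polar, X_polar)
```

The result is the familiar line element $dr^2 + r^2 d\theta^2$, here obtained purely from the transformation rule.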

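A smooth map

$$ \psi : M \to N \tag{2.79} $$

induces a (vector bundle) map

$$ \psi'^{(0,l)} : T^{(0,l)}M \to T^{(0,l)}N \tag{2.80} $$

via the obvious pointwise maps

$$ \psi'^{(0,l)}_x : T^{(0,l)}_xM \to T^{(0,l)}_{\psi(x)}N. \tag{2.81} $$

However, to extend this to a map

$$ \psi'^{(k,l)} : T^{(k,l)}M \to T^{(k,l)}N, \tag{2.82} $$

we need $\psi$ to be invertible (with smooth inverse), in which case we may as well take $N = M$ and assume that $\psi : M \to M$ is a diffeomorphism. In that case, we have

$$ \psi'^{(k,l)}_x : T^{(k,l)}_xM \to T^{(k,l)}_{\psi(x)}M; \tag{2.83} $$
$$ (\psi'^{(k,l)}_x \tau_x)(X_1,\dots,X_k; \theta^1,\dots,\theta^l) = \tau_x((\psi'_x)^{-1}X_1,\dots,(\psi'_x)^{-1}X_k; \psi^*_x\theta^1,\dots,\psi^*_x\theta^l), \tag{2.84} $$

where $X_i \in T_{\psi(x)}M$ and $\theta^j \in T^*_{\psi(x)}M$. This can also be done with $\psi$ replaced by $\psi^{-1}$, giving maps

$$ \psi^*_{(k,l)} : T^{(k,l)}M \to T^{(k,l)}M, \tag{2.85} $$

which in turn induce maps on the sections

$$ \psi^*_{(k,l)} : \mathfrak{X}^{(k,l)}(M) \to \mathfrak{X}^{(k,l)}(M), \tag{2.86} $$

often just called $\psi^*$, via

$$ (\psi^*\tau)_x(X_1,\dots,X_k; \theta^1,\dots,\theta^l) = \tau_{\psi(x)}(\psi'_xX_1,\dots,\psi'_xX_k; (\psi^*_x)^{-1}\theta^1,\dots,(\psi^*_x)^{-1}\theta^l). \tag{2.87} $$

In particular, $\psi^*_{(1,0)}$ is the map $\psi^*$ from (2.58), whereas $\psi^*_{(0,1)} = \psi_*^{-1}$ (recall that $\psi_* = \psi'$).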
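A natural operation on tensors, which is often used in GR, is tensoring: if

$$ \tau_1 \in \mathfrak{X}^{(k_1,l_1)}(M) \quad \text{and} \quad \tau_2 \in \mathfrak{X}^{(k_2,l_2)}(M), \tag{2.88} $$

then

$$ \tau_1 \otimes \tau_2 \in \mathfrak{X}^{(k_1+k_2,\, l_1+l_2)}(M) \tag{2.89} $$

is defined by concatenation, i.e.

$$ \tau_1 \otimes \tau_2\,(X_1,\dots,X_{k_1},Y_1,\dots,Y_{k_2}; \theta^1,\dots,\theta^{l_1},\rho^1,\dots,\rho^{l_2}) := \tau_1(X_1,\dots,X_{k_1}; \theta^1,\dots,\theta^{l_1})\, \tau_2(Y_1,\dots,Y_{k_2}; \rho^1,\dots,\rho^{l_2}). \tag{2.90} $$

Indeed, $\mathfrak{X}^{(k,l)}(M)$ itself arose in this way by tensoring copies of $\mathfrak{X}^{(1,0)}(M)$ and $\mathfrak{X}^{(0,1)}(M)$.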


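Another important operation for GR is (index) contraction: If $k > 0$ and $l > 0$, then a tensor $\tau \in \mathfrak{X}^{(k,l)}(M)$ may be contracted along one fixed upper and one lower index, say the $i$-th lower and the $j$-th upper one (the result depends on this choice), so as to obtain a tensor $\sigma \in \mathfrak{X}^{(k-1,l-1)}(M)$ with two indices less.

Let $(e_a)$ be a basis of $T_xM$, with dual basis $(\omega^a)$ of $T^*_xM$ (i.e. $\omega^a(e_b) = \delta^a_b$); in local coordinates one could take the $(\partial_i)$ basis, with dual $(dx^i)$. Then

$$ \sigma^{b_1 \cdots \widehat{b_j} \cdots b_l}_{a_1 \cdots \widehat{a_i} \cdots a_k}(x) = \tau^{b_1 \cdots a \cdots b_l}_{a_1 \cdots a \cdots a_k}(x), \tag{2.91} $$

where, according to the Einstein summation convention, $a$ (which on the right-hand side occupies the $j$-th upper and the $i$-th lower slot) is summed over, and a hat means that the given index is omitted. This is easily seen to be independent of the basis.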
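The basis independence of contraction can be seen in the simplest case $k = l = 1$, where the contraction $\sigma = \tau^a_a$ is the trace of the matrix of components. The sketch below is illustrative only: the sample matrices are arbitrary and the names hypothetical. It changes the basis by an invertible matrix $P$, so that the components transform as $P^{-1} \tau P$, and compares the two contractions.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def invert_2x2(P):
    """Inverse of a 2x2 matrix."""
    det = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return [[P[1][1] / det, -P[0][1] / det],
            [-P[1][0] / det, P[0][0] / det]]

def contract(tau):
    """Contraction for type (1,1): sigma = tau^a_a, with tau[b][a] = tau^b_a."""
    return sum(tau[a][a] for a in range(len(tau)))

tau = [[1.0, 2.0],
       [3.0, 4.0]]            # components tau^b_a in the basis (e_a)
P = [[2.0, 1.0],
     [1.0, 1.0]]              # change of basis e'_a = P^b_a e_b (det P = 1)
tau_new = matmul(invert_2x2(P), matmul(tau, P))   # components in the basis (e'_a)

sigma_old = contract(tau)
sigma_new = contract(tau_new)
```

Individual components change under the change of basis, but the contracted scalar does not.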

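Finally, the Lie derivative $\mathcal{L}_X$, so far only defined on vector fields, may be extended to a linear (and "$C^\infty(M)$-Leibnizian") map

$$ \mathcal{L}^{(k,l)}_X : \mathfrak{X}^{(k,l)}(M) \to \mathfrak{X}^{(k,l)}(M) \tag{2.92} $$

in two equivalent ways:

• Concretely, writing $\mathcal{L}_X$ for $\mathcal{L}^{(k,l)}_X$ for simplicity, one may define

$$ \mathcal{L}_X \tau = \lim_{t \to 0} \frac{\psi_t^*(\tau) - \tau}{t} \qquad (\tau \in \mathfrak{X}^{(k,l)}(M)), \tag{2.93} $$

cf. (2.34). In local coordinates, this gives the following explicit formula:

$$ \begin{aligned} (\mathcal{L}_X \tau)^{j_1 \cdots j_l}_{i_1 \cdots i_k} = {} & X^i \partial_i \tau^{j_1 \cdots j_l}_{i_1 \cdots i_k} + (\partial_{i_1} X^i)\, \tau^{j_1 \cdots j_l}_{i\, i_2 \cdots i_k} + \cdots + (\partial_{i_k} X^i)\, \tau^{j_1 \cdots j_l}_{i_1 \cdots i_{k-1}\, i} \\ & - (\partial_i X^{j_1})\, \tau^{i\, j_2 \cdots j_l}_{i_1 \cdots i_k} - \cdots - (\partial_i X^{j_l})\, \tau^{j_1 \cdots j_{l-1}\, i}_{i_1 \cdots i_k}, \end{aligned} \tag{2.94} $$

of which (2.8) is clearly a special case.

• Axiomatically, one may define the $\mathcal{L}^{(k,l)}_X$ as the unique linear maps satisfying the rules:

1. $\mathcal{L}^{(0,0)}_X f = Xf$ for functions $f \in C^\infty(M) = \mathfrak{X}^{(0,0)}(M)$;
2. $\mathcal{L}^{(0,1)}_X Y = [X,Y]$ for vector fields $Y \in \mathfrak{X}(M) = \mathfrak{X}^{(0,1)}(M)$;
3. $(\mathcal{L}^{(1,0)}_X \theta)(Y) = \mathcal{L}_X(\theta(Y)) - \theta(\mathcal{L}_X Y)$ for covector fields $\theta \in \Omega(M) = \mathfrak{X}^{(1,0)}(M)$;
4. $\mathcal{L}_X(\sigma \otimes \tau) = (\mathcal{L}_X \sigma) \otimes \tau + \sigma \otimes \mathcal{L}_X \tau$ for all higher-order tensors (Leibniz rule).

It follows from either rules 1-4 or (2.94) that for all cases $\mathcal{L}^{(k,l)}_X \equiv \mathcal{L}_X$ one has the lovely rule

$$ [\mathcal{L}_X, \mathcal{L}_Y] = \mathcal{L}_{[X,Y]}. \tag{2.95} $$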
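Two small computations, added here as worked checks, show how the axiomatic rules and the coordinate formula fit together. They assume the standard coordinate expression $[X,Y]^j = X^i \partial_i Y^j - Y^i \partial_i X^j$ of the bracket and its definition as the commutator of derivations. The first derives the covector case of the coordinate formula from rules 1-3; the second verifies the commutator rule on functions.

```latex
% Covector fields: take Y = \partial_i in rule 3 and use [X, \partial_i] = -(\partial_i X^j)\,\partial_j.
\begin{align*}
(\mathcal{L}_X \theta)_i
  &= (\mathcal{L}_X \theta)(\partial_i)
   = X\big(\theta(\partial_i)\big) - \theta\big([X, \partial_i]\big)
   = X^j \partial_j \theta_i + (\partial_i X^j)\, \theta_j. \\[4pt]
% Functions: rule 1 and the definition of the bracket.
[\mathcal{L}_X, \mathcal{L}_Y] f
  &= X(Yf) - Y(Xf) = [X,Y] f = \mathcal{L}_{[X,Y]} f.
\end{align*}
```

The first line is the coordinate formula for the Lie derivative of a tensor of type $(1,0)$, with the plus sign characteristic of lower indices.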
2.6 Manifolds with boundaries and corners


For the action principle in GR as well as for things like Penrose diagrams or Cauchy horizons we will need an extension of the manifold concept defined in §2.1 so as to incorporate (smooth) boundaries and, sometimes, corners (which lead to non-smooth boundaries).^135 For $n \geq 1$, let

$$ \mathbb{R}^n_+ := \{ x \in \mathbb{R}^n \mid x^n \geq 0 \}; \qquad \mathbb{R}^n_{++} := \{ x \in \mathbb{R}^n \mid x^i \geq 0,\ i = 1,\dots,n \}. \tag{2.96} $$


Definition 2.8 1. A $C^k$-manifold with boundary $M$ is defined in the same way as a manifold (cf. §2.1), except that one replaces $\mathbb{R}^n$ by $\mathbb{R}^n_+$ throughout the definition. In particular, each point $x \in M$ has a nbhd $U \in \mathcal{O}(M)$ that is homeomorphic to some $V \in \mathcal{O}(\mathbb{R}^n_+)$.

2. Similarly, a manifold with corners is defined from the model space $\mathbb{R}^n_{++}$ instead of $\mathbb{R}^n_+$.

3. In these definitions, $C^k$-regularity of the transition functions $\varphi_\beta \circ \varphi_\alpha^{-1}$ (see Definition 2.1.4 in §2.1) is defined by declaring $F : V \to \mathbb{R}^n$, where $V \in \mathcal{O}(\mathbb{R}^n_+)$ or $V \in \mathcal{O}(\mathbb{R}^n_{++})$, to be $C^k$, $0 \leq k \leq \infty$, iff $F$ can be extended to a $C^k$ map on some open nbhd of $V$ in $\mathbb{R}^n$.^136

4. In both cases a map $f : M \to \mathbb{R}$ is $C^k$ iff the map $f \circ \varphi_\alpha^{-1} : V_\alpha \to \mathbb{R}$ is $C^k$ for each $\alpha$.

5. The boundary $\partial M$ of a manifold $M$ with boundary or corners is the set of all $x \in M$ whose image $\varphi(x)$ in some chart $(U,\varphi)$ with $x \in U$ lies on the (topological) boundary $\partial\varphi(U)$ of $\varphi(U)$ in $\mathbb{R}^n$ (this is independent of the chart).^137 In addition, a boundary point of a manifold with corners is a corner point if at least two of the coordinates of $\varphi(x)$ vanish.

6. The interior $\mathrm{int}(M)$ is defined as $M \setminus \partial M$.

7. For $k = \infty$, the tangent bundle is defined exactly as in Definition 2.4. In particular, for any $x \in M$, the tangent space $T_xM$ is the space of all point derivations (2.2) of $C^\infty(M)$.


The boundary of a manifold with boundary is itself a manifold (without boundary or corners), in the same class $C^k$ as $M$ itself, of dimension $n - 1$ (i.e. one less than $M$). This should be clear for $\mathbb{R}^n_+$ itself, where $\partial\mathbb{R}^n_+ = \{ x \in \mathbb{R}^n \mid x^n = 0 \}$, which is clearly $\cong \mathbb{R}^{n-1}$. However, corner points typically ruin $C^k$ regularity of the boundary; removing them leaves a disconnected $C^k$ boundary. On the other hand, in both cases $\mathrm{int}(M)$ is again a "plain", $n$-dimensional manifold.^138

Surprisingly, for $M = \mathbb{R}^n_+$, the tangent space is just $T_x\mathbb{R}^n_+ = T_x\mathbb{R}^n$ even at $x \in \partial\mathbb{R}^n_+$, and also for general $M$ the fibers $T_xM$ are vector spaces with a coordinate basis $(\partial/\partial x^1,\dots,\partial/\partial x^n)$ at any $x \in M$. This makes it possible to define tensors and (semi) Riemannian metrics as usual.

To recover the intuition that tangent vectors at boundary points $x \in \partial M$ should be directed inwards (at least without corners), note that the set-theoretic complement $T_xM \setminus T_x\partial M$ of $T_x\partial M$ is the disjoint union of two open half-spaces of which one, call it $T^+_xM$, consisting of inward tangent vectors, is distinguished by the property that for any $X \in T^+_xM$ there exists a smooth (or $C^k$) curve $c : [0,\varepsilon) \to M$ for which $c((0,\varepsilon)) \subset \mathrm{int}(M)$ and $Xf(x) = df(c(t))/dt|_{t=0}$, as usual.


135 See Lee (2012) for both boundaries and corners, and Gallot, Hulin, & Lafontaine (1990) for boundaries. Manifolds with corners are usually studied using the b-calculus of Melrose (1996).

136 For $k = \infty$, Seeley's extension theorem states that this is equivalent to all derivatives of $F$ being bounded on all bounded subsets of the (topological) interior $\mathrm{int}(V)$ of $V$ (Seeley, 1964). See also Grieser (2000).

137 Either $x \in M$ has an open nbhd $U \cong V \in \mathcal{O}(\mathbb{R}^n)$, in which case $x \notin \partial M$, or it doesn't, in which case $x \in \partial M$.

138 A basic result is the collar neighbourhood theorem, which states that if $M$ is a smooth manifold with boundary, then $\partial M$ has an open nbhd in $M$ that is diffeomorphic to $\partial M \times [0,1)$. See e.g. Schultz (undated).
2.7 Summary


Differential geometry gets going as soon as we define the space $C^\infty(M)$ of smooth (real-valued) functions on a manifold $M$; this is done through local charts $\varphi : U \to \mathbb{R}^n$ ($U \subset M$). The coordinates $(x^1,\dots,x^n)$ of $x \in U$ with respect to $\varphi = (\varphi^1,\dots,\varphi^n)$ are $x^i = \varphi^i(x)$.

The tangent bundle $TM$ to $M$ is the union $TM = \bigsqcup_{x \in M} T_xM$, where $T_xM$ is the space of point derivations at $x$, defined as linear maps $\delta_x : C^\infty(M) \to \mathbb{R}$ that satisfy the Leibniz rule $\delta_x(fg) = \delta_x(f)g(x) + f(x)\delta_x(g)$. Each $\delta_x$ takes the form $\delta_x(f) = \frac{d}{dt} f(\gamma(t))|_{t=0}$, where $\gamma : (-\varepsilon,\varepsilon) \to M$ is some curve through $x = \gamma(0)$; then $\delta_x$ is called a tangent vector $X_x$.

A smooth section $x \mapsto \delta_x$ of $TM$ corresponds to a derivation $\delta : C^\infty(M) \to C^\infty(M)$, i.e. a linear map satisfying $\delta(fg) = \delta(f)g + f\delta(g)$. Conversely, each derivation defines point derivations $\delta_x(f) = \delta(f)(x)$. Seen as $x \mapsto X_x$, a derivation $\delta = X$ is a vector field on $M$. The set of all vector fields on $M$ is denoted by $\mathfrak{X}(M)$. It is naturally a $C^\infty(M)$ module.

The coordinates of $X_x \in T_xM$ with respect to $\varphi$ are $X^i = X_x\varphi^i$ (where $X_x = \delta_x$ is restricted to $U$), and one has $X_x = X^i\partial_i$, where $\partial_i = \partial/\partial x^i$, and $(\partial_1,\dots,\partial_n)$ form a basis of $T_xM$.

The cotangent bundle $T^*M$ to $M$ is the union $T^*M = \bigsqcup_{x \in M} T^*_xM$, where $T^*_xM$ is the linear dual $\mathrm{Hom}(T_xM,\mathbb{R})$. Each $C^\infty(M)$-linear map $\theta : \mathfrak{X}(M) \to C^\infty(M)$, called a 1-form, comes from a cross-section $x \mapsto \theta_x$ with $\theta_x \in T^*_xM$. The set of all 1-forms on $M$ is called $\Omega(M)$.

The exterior derivative $d : C^\infty(M) \to \Omega(M)$ is canonically defined by $df(X) := X(f)$.

The coordinates of $\theta_x \in T^*_xM$ with respect to $\varphi$ are $\theta_i = \theta(\partial_i)$, and then $\theta = \theta_i\, dx^i$.

The vector bundle $T^{(k,l)}M = \bigsqcup_{x \in M} T^{(k,l)}_xM$ of tensors of type $(k,l)$ over $M$ is defined by

$$ T^{(k,l)}_xM := \mathrm{Hom}((T_xM)^k \times (T^*_xM)^l, \mathbb{R}) \cong (\otimes^k T^*_xM) \otimes (\otimes^l T_xM). $$

The cross-sections $x \mapsto \tau_x \in T^{(k,l)}_xM$ are the maps $\tau : \mathfrak{X}(M)^k \times \Omega(M)^l \to C^\infty(M)$ that are $(k+l)$-multilinear over $C^\infty(M)$. These maps, also called tensors, form $\mathfrak{X}^{(k,l)}(M)$.

Important special cases are: $T^{(1,0)}M = T^*M$, so that $\mathfrak{X}^{(1,0)}(M) = \Omega(M)$, and $T^{(0,1)}M = TM$, so that $\mathfrak{X}^{(0,1)}(M) = \mathfrak{X}(M)$. Furthermore, the metric tensor $g$ of GR will be in $\mathfrak{X}^{(2,0)}(M)$.

The coordinates $\tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}$ of $\tau_x \in T^{(k,l)}_xM$ are given by $\tau_x(\partial_{i_1},\dots,\partial_{i_k}; dx^{j_1},\dots,dx^{j_l})$, and we have $\tau_x = \tau^{j_1 \cdots j_l}_{i_1 \cdots i_k}(x)\, dx^{i_1} \otimes \cdots \otimes dx^{i_k} \otimes \partial_{j_1} \otimes \cdots \otimes \partial_{j_l}$. For the metric, this gives $g_{ij} = g(\partial_i,\partial_j)$.

For each vector field $X \in \mathfrak{X}(M)$, the Lie derivative $\mathcal{L}_X : \mathfrak{X}^{(k,l)}(M) \to \mathfrak{X}^{(k,l)}(M)$ is a linear map that satisfies $\mathcal{L}_X(f\tau) = (Xf)\tau + f \cdot \mathcal{L}_X(\tau)$ for each $f \in C^\infty(M)$ and $\tau \in \mathfrak{X}^{(k,l)}(M)$.

The Lie derivative satisfies three important properties: for vector fields $Y \in \mathfrak{X}(M)$ one has $[\mathcal{L}_X, \mathcal{L}_Y] = \mathcal{L}_{[X,Y]}$ as well as $\mathcal{L}_X Y = [X,Y]$, whilst $\mathcal{L}_X f = Xf$ on functions $f \in C^\infty(M)$.

Unless stated otherwise, all maps between smooth objects are required to be smooth.

The Einstein summation convention holds: repeated (diagonal) indices are summed over.
3 Metric differential geometry


The main object of study in GR is the metric tensor $g \in \mathfrak{X}^{(2,0)}(M)$. This is a smooth family

$$ g_x : T_xM \times T_xM \to \mathbb{R} \tag{3.1} $$

of metrics as defined in §2.3, where now $V = T_xM$ is indexed by $x \in M$. Thus, repeating the definition, each $g_x$ is bilinear, symmetric (i.e. $g_x(X_x,Y_x) = g_x(Y_x,X_x)$ for all $X_x, Y_x \in T_xM$), and nondegenerate (i.e. $g_x(X_x,Y_x) = 0$ for all $Y_x \in T_xM$ iff $X_x = 0$). It need not be positive definite.

The orthonormal basis $(e_a(x)) = (e_1(x),\dots,e_n(x))$ in which $g_x$ is diagonal (cf. §2.3) may, and typically will, depend on $x$. But if $M$ is connected, the signature of $g_x$ is independent of $x$ by continuity. Even if $M$ is not connected, we assume it is independent of $x$. Thus the signature is an intrinsic property not only of each pointwise metric $g_x$, but even of an entire metric tensor $g$. A manifold $M$ with a metric tensor $g$ is called semi-Riemannian, with two special cases:

1. The metric (or manifold) is called Riemannian if the signature is $(+ \cdots +)$. Thus each $g_x$ is positive definite. Given the assumption of symmetry, this implies that $g_x$ is nondegenerate, so a metric tensor is Riemannian iff each $g_x$ is symmetric and positive definite.^139

2. The metric (or manifold) is called Lorentzian if $\dim(M) = 4$ and $n_- = 1$, i.e. the signature of $g$ is $(-+++)$.^140 Hence with respect to an orthonormal basis $(e_a)$ we have

$$ g(e_a, e_b) = \eta_{ab}; \qquad \eta := \mathrm{diag}(-1,1,1,1). \tag{3.2} $$

With some abuse of notation, the symbol $\eta_n$, with $\eta \equiv \eta_4$, is also used for the Minkowski metric

$$ \eta_n : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}; \qquad \eta_n(X,Y) := \eta_{\mu\nu}X^\mu Y^\nu = -X^0Y^0 + \sum_{i=1}^{n-1} X^iY^i, \tag{3.3} $$

where $(X^\mu)$ and $(Y^\mu)$ are either meant to be Cartesian coordinates on $\mathbb{R}^4$ seen as a vector space,^141 or, identifying $T_x\mathbb{R}^4 \cong \mathbb{R}^4$, denote components of tangent vectors $X = X^\mu\partial_\mu$ etc. with respect to the basis $(\partial_0,\partial_1,\partial_2,\partial_3)$ defined by the usual coordinates $(x^\mu)$ on $\mathbb{R}^4$, seen as our manifold $M$. Either way, $(\mathbb{R}^4,\eta)$, often written as $(\mathbb{M},\eta)$, is Minkowski space-time, which is the oldest and simplest example of a Lorentzian manifold. The fact that, in this special case, the metric is defined not only on each tangent space $T_xM$, as always, but also on $M$ itself, has no analogue for general Lorentzian manifolds. In special relativity, however, lightcones and other causal structures are defined in $M = \mathbb{R}^4 = \mathbb{M}$ itself, which makes it useful to define the metric $\eta$ on both $\mathbb{M}$ and $T_x\mathbb{M}$. Causal theory for general Lorentzian manifolds will be developed in §5.3.

Lorentzian manifolds underlie GR, but we often invoke examples from Riemannian geometry in order to explain some contrast with the Lorentzian case. Furthermore, Riemannian submanifolds of $M$ are often important, e.g. in the Cauchy problem for GR (see chapter 7).


139The case (—--- —) may also be included here, since an overall change of sign in g makes it Riemannian. 

140 This name is sometimes also used in any dimension d > 2 provided n- = dim(M) — 1. Furthermore, a similar 
comment as in the previous footnote applies: we may as well take n} = dim(M) — 1. In any d > 2, a necessary and 
sufficient condition for a metrizable manifold M to support a Lorentzian metric is that M is either non-compact, or, 
if it is compact, has zero Euler characteristic. These conditions are equivalent to the existence of a non-vanishing 
continuous vector field on M (Palomo & Romero 2006, §1.1; Minguzzi, 2019, §1.8). For deeper topological 
constraints imposed by Lorentzian metrics with additional (causality) properties, see Chernov & Nemirovski (2013). 
But in GR one often starts with a metric defined by some formula and looks for a manifold supporting it! 

141 Seen as Minkowski space-time, it is conventional to relabel the usual coordinates of $\mathbb{R}^4$ as $(x^0, x^1, x^2, x^3)$, where
$x^0 = t$ denotes time. In diagrams, the time axis is always drawn vertically. We also introduce a convention often
used in the (physics) literature: Greek indices $\mu$, $\nu$ etc. run from 0 to 3, whereas Latin indices $i$, $j$ etc. run from 1 to
3. Both Greek and Latin indices midway in the alphabet usually refer to the canonical coordinate basis $\partial_\mu = \partial/\partial x^\mu$
or $\partial_i = \partial/\partial x^i$, whereas indices $a$, $b$ etc. typically refer to other bases $(e_a)$, often orthonormal ones.
3.1 Lowering and raising indices


Let $(M,g)$ be a (semi) Riemannian manifold. Since each $g_x$ is a metric, the distinction between
vectors and covectors is blurred, because as in §2.3 we have “musical” isomorphisms


$\flat_x : T_xM \to T_x^*M; \quad \flat_x(X) = X^\flat; \quad X^\flat(Y) := g_x(X,Y);$ (3.4)
$\sharp_x : T_x^*M \to T_xM; \quad \sharp_x(\theta) = \theta^\sharp; \quad g_x(\theta^\sharp, X) := \theta(X),$ (3.5)

which are each other’s inverse. These pointwise isomorphisms induce mutually inverse maps
$\flat : \mathfrak{X}(M) \to \Omega(M); \quad \sharp : \Omega(M) \to \mathfrak{X}(M),$ (3.6)


by pointwise application. This leads to the lowering and raising of indices, which is crucial to
almost any computation in GR. At any point $x$ (which we omit) we define $(g^{ab})$ as the inverse
(matrix) to $(g_{ab})$, where $g_{ab} = g(e_a, e_b)$ in some basis $(e_a)$ (so that $g^{ab}g_{bc} = \delta^a_c$). Then


$(X^\flat)_a = g_{ab}X^b; \quad (\theta^\sharp)^a = g^{ab}\theta_b,$ (3.7)


which notation may then be extended to any tensor, where the “sharp” and “flat” signs are
usually omitted. For example, (3.7) is simply written as $X_a = g_{ab}X^b$ and $\theta^a = g^{ab}\theta_b$.
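
The index gymnastics of (3.7) can be tried out numerically. The following is a minimal sketch, assuming the Minkowski metric $\eta = \mathrm{diag}(-1,1,1,1)$ of (3.2), whose matrix is its own inverse; the helper names `lower` and `raise_index` are illustrative choices, not notation from the text.

```python
# Minkowski metric eta = diag(-1, 1, 1, 1); as a matrix it equals its own inverse.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]


def lower(metric, vector):
    """Lower an index: X_a = g_ab X^b, cf. (3.7)."""
    n = len(vector)
    return [sum(metric[a][b] * vector[b] for b in range(n)) for a in range(n)]


def raise_index(inverse_metric, covector):
    """Raise an index: theta^a = g^ab theta_b, cf. (3.7)."""
    n = len(covector)
    return [sum(inverse_metric[a][b] * covector[b] for b in range(n))
            for a in range(n)]


X = [2, 1, 0, 3]                     # components X^mu of a sample vector
X_flat = lower(ETA, X)               # only the time component changes sign
X_back = raise_index(ETA, X_flat)    # raising undoes lowering
```

For a general metric one would pass the inverse matrix $(g^{ab})$ to `raise_index`; here it happens to coincide with $(g_{ab})$.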
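The above definition of $(g^{ab})$ is consistent with the following one. Extending $\sharp_x$ to a map


$\sharp_x \otimes \sharp_x : T_x^*M \otimes T_x^*M \to T_xM \otimes T_xM$ (3.8)
in the obvious way, i.e., by linear extension of $\theta \otimes \eta \mapsto \theta^\sharp \otimes \eta^\sharp$, we obtain
$\sharp_x \otimes \sharp_x(g_x) \in T_x^{(0,2)}M = \mathrm{Hom}(T_x^*M \times T_x^*M, \mathbb{R}).$ (3.9)


If $(\omega^a)$ is the dual basis to $(e_a)$, then $g^{ab} = \sharp_x \otimes \sharp_x(g_x)(\omega^a(x), \omega^b(x))$, as the reader will verify.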
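More generally, lowering and raising of specified indices are maps defined, respectively, by


$\flat : \mathfrak{X}^{(k,l)}(M) \to \mathfrak{X}^{(k+1,l-1)}(M); \quad \sharp : \mathfrak{X}^{(k,l)}(M) \to \mathfrak{X}^{(k-1,l+1)}(M),$ (3.10)
provided $l > 0$ in the first and $k > 0$ in the second case. Taking the first index for example gives
$T^\flat(X_1,\ldots,X_{k+1},\theta^1,\ldots,\theta^{l-1}) = T(X_2,\ldots,X_{k+1},X_1^\flat,\theta^1,\ldots,\theta^{l-1});$ (3.11)
$T^\sharp(X_1,\ldots,X_{k-1},\theta^1,\ldots,\theta^{l+1}) = T((\theta^1)^\sharp,X_1,\ldots,X_{k-1},\theta^2,\ldots,\theta^{l+1}).$ (3.12)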


Curvature will be described by the Riemann tensor $R \in \mathfrak{X}^{(3,1)}(M)$, of which the only upper index is
usually written first. This index may then be lowered, so that $R^\flat \in \mathfrak{X}^{(4,0)}(M)$ has components
$R^\flat_{abcd} \equiv R_{abcd} = g_{ae}R^e{}_{bcd}.$ (3.13)


The contraction process explained at the end of the previous chapter, which in principle has
nothing to do with the metric, may now elegantly be rewritten in terms of the metric by, e.g.,


$R_{ab} = R^c{}_{acb} = g^{cd}R_{dacb} = g^{cd}R_{cadb}.$ (3.14)
Metric contraction may be done even in cases where the original version does not apply, as in
$R = R^a{}_a = g^{ab}R_{ab}.$ (3.15)


If $R \in \mathfrak{X}^{(3,1)}(M)$ is the Riemann tensor, so that its first contraction $R \in \mathfrak{X}^{(2,0)}(M)$ is the Ricci
tensor, this second contraction yields the Ricci scalar, which again plays a central role in GR.¹⁴²


142 Our use of the same letter $R$ for the Riemann tensor, the Ricci tensor, and the Ricci scalar will never lead
to confusion, as all relevant instances contain indices distinguishing them. For experts: we do not use Penrose’s
abstract index notation, which may clarify things but ever so often leads to typographically awkward expressions.
3.2 Geodesics


Intuitively, geodesics are paths of shortest length between two given points.¹⁴³ This idea only
makes direct sense in the Riemannian case (as opposed to the semi-Riemannian case), with
which we therefore start. We will then find a redefinition of a geodesic that does make sense also
on semi-Riemannian manifolds. Throughout this section $(M,g)$ is a Riemannian manifold. It
will now be convenient to use closed intervals $I = [a,b]$ as the domains of curves $\gamma: I \to M$.


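1. The length of a curve $\gamma: [a,b] \to M$ is defined as


$L(\gamma) = \int_a^b dt\, \|\dot\gamma(t)\| = \int_a^b dt\, \sqrt{g_{\gamma(t)}(\dot\gamma(t), \dot\gamma(t))},$ (3.16)


where $\dot\gamma(t) \in T_{\gamma(t)}M$ is the tangent vector to the curve, cf. (2.25). So in coordinates one
has $\gamma(t) = (\gamma^1(t),\ldots,\gamma^n(t))$, where $\gamma^i : [a,b] \to \mathbb{R}$, and hence


$\|\dot\gamma(t)\|^2 = g_{ij}(\gamma(t))\,\dot\gamma^i(t)\,\dot\gamma^j(t).$ (3.17)


Using a change of variables in the integral (3.16), it is easy to show that the length of $\gamma$ is
independent of its parametrization, so that it only depends on the image $\gamma([a,b])$ in $M$.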


2. If $M$ is connected, any two points can be connected by a smooth curve, and hence we can
define the distance $d(x,y)$ between $x,y \in M$ as the infimum of $L(\gamma)$ over all smooth curves
$\gamma: [0,1] \to M$ with $\gamma(0) = x$ and $\gamma(1) = y$ (one may equivalently use piecewise smooth
curves here, since these can be smoothened, cf. Lemma 5.8 below). This is a metric on $M$,
whose metric topology coincides with the original topology of $M$.¹⁴⁴


3. A geodesic is a curve of extremal length (with a specific parametrization, see below). 
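
The parametrization independence of the length claimed in item 1 can be illustrated numerically. The sketch below assumes the flat Euclidean plane and approximates $L(\gamma)$ by the length of a fine inscribed polygon rather than by the integral (3.16) itself; the function names and the choice of a half circle are illustrative.

```python
import math


def curve_length(gamma, a, b, steps=20000):
    """Approximate the Euclidean length of gamma on [a, b] by a fine polygon."""
    total = 0.0
    previous = gamma(a)
    for k in range(1, steps + 1):
        t = a + (b - a) * k / steps
        point = gamma(t)
        total += math.dist(previous, point)   # Euclidean distance of neighbours
        previous = point
    return total


def half_circle(t):
    """Upper unit half circle, traversed at constant speed for t in [0, 1]."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))


def half_circle_reparam(t):
    """Same image, but traversed with the non-affine parameter change t -> t**2."""
    return half_circle(t * t)


length_1 = curve_length(half_circle, 0.0, 1.0)
length_2 = curve_length(half_circle_reparam, 0.0, 1.0)
```

Both values approximate $\pi$, the length of the half circle, although the two curves are traversed at different speeds.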


We will not precisely explain what this problem in the calculus of variations means, since our
goal is merely to motivate Definition 3.1 below, which also applies to the semi-Riemannian case.
Therefore, we just outline how this extremal problem is solved. In general, a functional


$S(\gamma) = \int_a^b dt\, \mathcal{L}(\gamma(t), \dot\gamma(t))$ (3.18)
is extremized (e.g. minimized or maximized) by some curve $\gamma$ iff the Euler-Lagrange equations hold:
$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot\gamma^i} = \frac{\partial\mathcal{L}}{\partial\gamma^i}.$ (3.19)


Short of giving an introduction to the calculus of variations, here is a heuristic derivation of
(3.19). Let $\gamma_s(t)$ be a family of curves indexed by $s$, such that the endpoints are fixed, that is,


$\gamma_s(a) = \gamma(a); \quad \gamma_s(b) = \gamma(b).$ (3.20)


143 Recall our standing assumption that all maps, including curves and metrics, are smooth. Uniqueness and
variational properties of geodesics change completely if the metric is just $C^1$ (Hartman & Wintner, 1951; Hartman,
1983). On the other hand, most of the smooth theory is already valid in the Hölder class $C^{1,1}$ (Minguzzi, 2015a).

144 See Jost (2002), pp. 14-15. We do not prove this since it is practically irrelevant for the Lorentzian case.
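The extremality condition that defines the variational problem is
$dS(\gamma_s)/ds = 0.$ (3.21)
On repeatedly using the chain rule and a partial integration, eq. (3.21) with (3.20) gives


$\frac{dS(\gamma_s)}{ds} = \int_a^b dt \left( \frac{\partial\mathcal{L}}{\partial\gamma_s^i}\frac{\partial\gamma_s^i}{\partial s} + \frac{\partial\mathcal{L}}{\partial\dot\gamma_s^i}\frac{\partial\dot\gamma_s^i}{\partial s} \right) = \int_a^b dt \left( \frac{\partial\mathcal{L}}{\partial\gamma_s^i}\frac{\partial\gamma_s^i}{\partial s} + \frac{\partial\mathcal{L}}{\partial\dot\gamma_s^i}\frac{\partial}{\partial t}\frac{\partial\gamma_s^i}{\partial s} \right)$
$= \int_a^b dt \left( \frac{\partial\mathcal{L}}{\partial\gamma_s^i} - \frac{\partial}{\partial t}\frac{\partial\mathcal{L}}{\partial\dot\gamma_s^i} \right)\frac{\partial\gamma_s^i}{\partial s} + \left[ \frac{\partial\mathcal{L}}{\partial\dot\gamma_s^i}\frac{\partial\gamma_s^i}{\partial s} \right]_a^b.$ (3.22)


Then (3.20) gives $d\gamma_s(a)/ds = d\gamma_s(b)/ds = 0$, so that, for arbitrary $\gamma_s$ and hence arbitrary
$\partial\gamma_s^i/\partial s$, eq. (3.21) implies (3.19), in which $s$ is dropped and hence $\partial/\partial t$ becomes $d/dt$.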

The Euler-Lagrange equations for the length functional (3.16) are not very nice, but they can
be simplified if a preferred (“affine”) parametrization is used. To motivate this, instead of the
length (3.16), we now start from the (kinetic) energy of our curve $\gamma$, defined as


$E(\gamma) = \int_a^b dt\, g_{\gamma(t)}(\dot\gamma(t),\dot\gamma(t)) = \int_a^b dt\, \|\dot\gamma(t)\|^2.$ (3.23)
For the energy (3.23), the Euler-Lagrange equations (3.19) give the geodesic equation
$\ddot\gamma^i(t) + \Gamma^i_{jk}(\gamma(t))\,\dot\gamma^j(t)\,\dot\gamma^k(t) = 0,$ (3.24)
or briefly $\ddot\gamma^i + \Gamma^i_{jk}\dot\gamma^j\dot\gamma^k = 0$, where $\ddot\gamma = d^2\gamma/dt^2$, and the Christoffel symbols are given by


$\Gamma^i_{jk} = \tfrac{1}{2}g^{im}(g_{mj,k} + g_{mk,j} - g_{jk,m}),$ (3.25)
where we have introduced another useful notational convention from GR:


$g_{ij,k} := \partial_k g_{ij} = \partial g_{ij}/\partial x^k.$ (3.26)
Warning: the Christoffel symbols do not form the components of a would-be tensor “$\Gamma \in
\mathfrak{X}^{(2,1)}(M)$”: physicists see this from their incorrect behaviour under coordinate transformations,
whereas mathematicians note that $\Gamma$ fails the tensoriality test, cf. Proposition 2.7. We will see,
however, that the $\Gamma$-symbols do combine into the Riemann tensor!

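To derive (3.24) for (3.23), i.e., for $\mathcal{L}(\gamma(t),\dot\gamma(t)) = g_{ij}(\gamma(t))\,\dot\gamma^i(t)\,\dot\gamma^j(t)$, one uses


$\frac{\partial\mathcal{L}}{\partial\gamma^i} = g_{jk,i}\,\dot\gamma^j\dot\gamma^k; \quad \frac{\partial\mathcal{L}}{\partial\dot\gamma^i} = 2g_{ij}\dot\gamma^j;$ (3.27)
$\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot\gamma^i} = 2\frac{d}{dt}(g_{ij}\dot\gamma^j) = 2(g_{ij,k}\dot\gamma^k\dot\gamma^j + g_{ij}\ddot\gamma^j) = (g_{ij,k} + g_{ik,j})\dot\gamma^j\dot\gamma^k + 2g_{ij}\ddot\gamma^j.$ (3.28)


Whereas solutions of (3.24) extremize the energy for any parametrization, for the length
(3.16), the Euler-Lagrange equations only take the form (3.24) iff $\|\dot\gamma(t)\|$ is constant, in which
case the parametrization of the curve $\gamma: [a,b] \to M$ is said to be affine. In particular, if $\|\dot\gamma(t)\| = 1$
for all $t \in I$, then we say that $\gamma$ is parametrized by arc length.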
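
As a concrete illustration of (3.24) and (3.25), here is a worked example that is not part of the derivation above: the flat plane in polar coordinates $(r,\varphi)$, assuming the metric $g = dr^2 + r^2 d\varphi^2$ on $r > 0$. The choice of example and of the test curve are ours.

```latex
% Flat plane in polar coordinates (illustrative choice):
% g_{rr} = 1, g_{\varphi\varphi} = r^2, g_{r\varphi} = 0, so g^{rr} = 1, g^{\varphi\varphi} = r^{-2}.
\begin{align*}
\Gamma^{r}_{\varphi\varphi}
  &= \tfrac12 g^{rr}\,(g_{r\varphi,\varphi} + g_{r\varphi,\varphi} - g_{\varphi\varphi,r}) = -r, \\
\Gamma^{\varphi}_{r\varphi} = \Gamma^{\varphi}_{\varphi r}
  &= \tfrac12 g^{\varphi\varphi}\,(g_{\varphi r,\varphi} + g_{\varphi\varphi,r} - g_{r\varphi,\varphi}) = \frac{1}{r},
\end{align*}
% and all other Christoffel symbols vanish, so that (3.24) becomes
\begin{align*}
\ddot r - r\,\dot\varphi^{2} &= 0, &
\ddot\varphi + \frac{2}{r}\,\dot r\,\dot\varphi &= 0 .
\end{align*}
% Check on the straight line x = 1, y = t, i.e. r = \sqrt{1+t^2}, \varphi = \arctan t:
% r^2\dot\varphi = 1 is constant, which is the second equation, and
% \ddot r = r^{-3} = r\,\dot\varphi^{2}, which is the first.
```

So the nonvanishing Christoffel symbols here merely encode a straight line in curvilinear coordinates, in line with the warning that they are not the components of a tensor.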


Definition 3.1 A geodesic is a curve $\gamma: I \to M$ (with $I \subset \mathbb{R}$ connected) that satisfies (3.24).
On this definition, geodesics still extremize length, but eq. (3.24) implies that $\|\dot\gamma(t)\|$ is constant,
as can be shown by computing $d(\|\dot\gamma(t)\|^2)/dt$ from (3.17). This time-derivative equals


$\frac{d\|\dot\gamma(t)\|^2}{dt} = \frac{d}{dt}(g_{ij}\dot\gamma^i\dot\gamma^j) = g_{ij,k}\dot\gamma^k\dot\gamma^i\dot\gamma^j + 2g_{ij}\ddot\gamma^i\dot\gamma^j.$ (3.29)
Eliminating $\ddot\gamma$ via (3.24) then leads to a cancellation making the right-hand side zero; a neater
calculation will be given after (3.66). The definition of a geodesic therefore depends on the
parametrization of $\gamma$: a reparametrized geodesic may no longer satisfy (3.24), except when the
reparametrization is affine, i.e. $s = at + b$. However, one has the following useful criterion.


Proposition 3.2 Some curve $\gamma: [a,b] \to M$ can be reparametrized so as to become a geodesic
iff it satisfies (3.24) with the right-hand side $0$ replaced by $f\cdot\dot\gamma^i$ for some function $f(t)$ defined along $\gamma$.


Proof. If some curve $t \mapsto \gamma(t)$ satisfies (3.24), then $t \mapsto \gamma(s(t))$ satisfies (3.24) with right-hand
side $(\ddot s/\dot s)\,\frac{d}{dt}\gamma^i(s(t))$, and conversely one can solve $f(t) = \ddot s(t)/\dot s(t)$ for $s$ and switch to $\gamma\circ s^{-1}$. □
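
The chain-rule computation behind this proof is short enough to spell out. The following worked step uses the auxiliary notation $\tilde\gamma(t) := \gamma(s(t))$ and a prime for $d/ds$, neither of which appears in the text.

```latex
% Reparametrize a solution \gamma of (3.24): \tilde\gamma(t) := \gamma(s(t)), prime = d/ds.
\begin{align*}
\dot{\tilde\gamma}^i &= \dot s\,\gamma'^i, \qquad
\ddot{\tilde\gamma}^i = \ddot s\,\gamma'^i + \dot s^{2}\,\gamma''^i, \\
\ddot{\tilde\gamma}^i + \Gamma^i_{jk}\,\dot{\tilde\gamma}^j\dot{\tilde\gamma}^k
  &= \ddot s\,\gamma'^i + \dot s^{2}\bigl(\gamma''^i + \Gamma^i_{jk}\gamma'^j\gamma'^k\bigr)
   = \ddot s\,\gamma'^i
   = \frac{\ddot s}{\dot s}\,\dot{\tilde\gamma}^i .
\end{align*}
% Hence f = \ddot s/\dot s, which presupposes \dot s \neq 0; f vanishes iff s = at + b is affine.
```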


Such a (poorly parametrized) curve that is “almost” a geodesic is sometimes called a pregeodesic. 
In $M = \mathbb{R}^n$ with flat metric (i.e. $g_{ij} = \delta_{ij}$) geodesics are straight lines that form shortest paths
between two given points. This is also true in e.g. hyperbolic space, and it is always true for 
sufficiently short geodesics. On the sphere (where geodesics are great circles) one has two 
geodesics between two generic points; but only one has minimal length. These lengths coincide 
iff the two points are polar opposites, in which case one has infinitely many geodesics. See §5.5. 
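
That solutions of (3.24) have constant speed can also be checked numerically on the sphere. The sketch below assumes the unit sphere with metric $d\theta^2 + \sin^2\theta\, d\varphi^2$, for which the nonzero Christoffel symbols are $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$ and $\Gamma^\varphi_{\theta\varphi} = \cot\theta$ (quoted here, not derived in the text), and integrates the geodesic equation with a classical Runge-Kutta step; initial data and step size are arbitrary illustrative choices.

```python
import math


def geodesic_rhs(state):
    """First-order form of (3.24) on the unit sphere, state = (theta, phi, dtheta, dphi)."""
    theta, phi, dtheta, dphi = state
    ddtheta = math.sin(theta) * math.cos(theta) * dphi ** 2
    ddphi = -2.0 * math.cos(theta) / math.sin(theta) * dtheta * dphi
    return (dtheta, dphi, ddtheta, ddphi)


def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def speed_squared(state):
    """Squared norm of the velocity, cf. (3.17), for the round metric."""
    theta, _, dtheta, dphi = state
    return dtheta ** 2 + math.sin(theta) ** 2 * dphi ** 2


state = (1.0, 0.0, 0.3, 0.7)          # start away from the poles
initial = speed_squared(state)
for _ in range(1000):                 # integrate from t = 0 to t = 1
    state = rk4_step(geodesic_rhs, state, 0.001)
final = speed_squared(state)
```

Up to the discretization error of the integrator, `final` agrees with `initial`, as the cancellation in (3.29) predicts.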

In the intuitive idea of geodesics the focus is on endpoints, whereas in defining geodesics as
solutions to the ODE (3.24) the focus is on the initial point $\gamma(0)$ and initial velocity $\dot\gamma(0)$. The
solution to (3.24) is uniquely defined by these data, except for $I$. But like any solution to an ODE,
$\gamma$ has some maximal domain of definition $I \subset \mathbb{R}$, and this domain may or may not equal $\mathbb{R}$.


Definition 3.3 If all geodesics $\gamma: I \to M$ with given initial point $\gamma(0)$ and initial velocity $\dot\gamma(0)$
can be defined on the maximum interval $I = \mathbb{R}$, we say that $(M,g)$ is geodesically complete.


For example, $\mathbb{R}^n$, the sphere $S^n$, and hyperbolic space $H^n$ are geodesically complete (cf. §4.4).
In the Riemannian case this is equivalent to a purely topological property. For $x,y \in M$ define


$d(x,y) := \inf\{L(\gamma) \mid \gamma: [a,b] \to M,\ \gamma(a) = x,\ \gamma(b) = y\}.$ (3.30)


It is easy to show that this defines a metric in the topological sense, i.e. a symmetric function
$d: M \times M \to [0,\infty)$ that satisfies $d(x,y) = 0$ iff $x = y$ and $d(x,y) \leq d(x,z) + d(z,y)$. In other
words, a Riemannian manifold $(M,g)$ is also a metric space $(M,d)$. For the latter, one has the
usual notion of completeness in the sense that any Cauchy sequence converges.


Theorem 3.4 (Hopf-Rinow) A Riemannian manifold (M,g) is geodesically complete iff the 
corresponding metric space (M,d) defined by (3.30) is complete. In that case, any two points 
x,y can be joined by a geodesic of minimum length (compared with all curves from x to y). 


Since this theorem has no analogue in the Lorentzian case we will not prove it. We do note that 
any compact Riemannian manifold is complete. On the other hand, examples of incomplete 
Riemannian manifolds are provided by open bounded sets $\Omega \subset \mathbb{R}^n$ with flat metric inherited
from $\mathbb{R}^n$, or $\mathbb{R}^n$ itself with one or more points or regions omitted. Such examples also show that
in the incomplete case the infimum in (3.30) may not be attained. Many Lorentzian manifolds of 
interest to GR are geodesically incomplete in a nontrivial (i.e. inextendible) sense; see chapter 6. 
3.3 Linear connections


The definition of a geodesic as a curve $\gamma$ whose tangent vector $\dot\gamma$ satisfies (3.24) wherever
$\gamma(t)$ is defined was inspired by the Riemannian case, but it will be taken as
the definition of a geodesic on a semi-Riemannian manifold, too. In support, we now give a
geometric perspective on the Christoffel symbols $\Gamma^i_{jk}$ and hence on the geodesic equation (3.24).


Definition 3.5 A linear connection on $M$ (which is the same thing as a connection on the
tangent bundle $TM$, see below), or, equivalently, a covariant derivative on $\mathfrak{X}(M)$, is a map


$X \mapsto \nabla_X : \mathfrak{X}(M) \to \mathfrak{X}(M),$ (3.31)
where $X$ itself is a vector field on $M$ (i.e. $X \in \mathfrak{X}(M)$), such that:


1. The map $X \mapsto \nabla_X$ is $\mathbb{R}$-linear as well as $C^\infty(M)$-linear, i.e.


$\nabla_{fX}Y = f\nabla_XY \quad \forall f \in C^\infty(M);$ (3.32)


2. The map $Y \mapsto \nabla_XY$ is $\mathbb{R}$-linear but not $C^\infty(M)$-linear: it satisfies the Leibniz rule
$\nabla_X(fY) = (Xf)Y + f\nabla_XY \quad \forall f \in C^\infty(M).$ (3.33)


This definition also makes sense on any open $U \in \mathcal{O}(M)$, and in fact if $x \in U$, then $\nabla_XY(x)$ only
depends on the value of $X$ at $x$ and the restriction of $Y$ to $U$; this follows from (3.32) - (3.33)
and the definition of a manifold. Hence we may compute covariant derivatives locally. Recall
that a local frame $(e_a)$ for $\mathfrak{X}(U)$ consists of $n$ maps $e_a : U \to TM$ such that at each $x \in U$ the
vectors $e_a(x) \in T_xM$ form a basis of $T_xM$ ($a = 1,\ldots,n$). The corresponding dual basis $(\omega^a)$ for
$\Omega(U)$ then consists of the $\omega^a(x) \in T_x^*M$ that satisfy $\omega^a(e_b) = \delta^a_b$. The given connection $\nabla$ is
then completely characterized by its connection coefficients $\omega^c_{ab}$, defined (at each $x$) by


$\nabla_{e_a}e_b = \omega^c_{ab}e_c.$ (3.34)


Indeed, from (2.68) - (2.69) we may write $X = X^ae_a$, where $X^a = \omega^a(X) \in C^\infty(U)$, so
$\nabla_XY = \nabla_{X^ae_a}(Y^be_b) = X^a\nabla_{e_a}(Y^be_b)$
$= X^a(e_a(Y^b)\cdot e_b + Y^b\nabla_{e_a}e_b)$
$= X^a(e_a(Y^c) + \omega^c_{ab}Y^b)e_c.$ (3.35)
We write $\nabla_XY^a$ for $(\nabla_XY)^a$, so that $\nabla_XY = (\nabla_XY^a)e_a$. We therefore have
$\nabla_XY^a = X(Y^a) + \omega^a_{bc}X^bY^c,$ (3.36)
where $X(Y^a)$ is the action of the vector field $X$ on the function $Y^a \in C^\infty(U)$. In terms of a
coordinate basis $(e_\mu = \partial_\mu)$, $(\omega^\mu = dx^\mu)$, writing $\nabla_\mu := \nabla_{\partial_\mu}$, the above relations imply
$\omega^\rho_{\mu\nu} = dx^\rho(\nabla_\mu\partial_\nu);$ (3.37)
$\nabla_XY^\rho = X^\mu(\partial_\mu Y^\rho + \omega^\rho_{\mu\nu}Y^\nu);$ (3.38)
$\nabla_\mu Y^\rho = \partial_\mu Y^\rho + \omega^\rho_{\mu\nu}Y^\nu.$ (3.39)
Linear connections formalize Levi-Civita’s notion of parallel transport. It follows from
(3.36) or (3.38) that $\nabla_XY$ only depends on the values of $Y$ along the flow lines of $X$, for


$\nabla_XY^a(x) = \frac{d}{dt}Y^a(\psi_t(x))\Big|_{t=0} + \omega^a_{bc}(x)X^b(x)Y^c(x),$ (3.40)
where $\psi$ is the flow of $X$. Conversely, given some curve $\gamma: I \to M$ with tangent vectors $\dot\gamma$ (defined
along $\gamma$ only!), the covariant derivative $\nabla_{\dot\gamma}Y$ of $Y$ along $\gamma$ is well defined for any vector field $Y$
defined near or even on $\gamma(I)$ alone; for in (local) coordinates we have


$\nabla_{\dot\gamma}Y^\rho(\gamma(t)) = \dot\gamma^\mu(t)\left(\partial_\mu Y^\rho(\gamma(t)) + \omega^\rho_{\mu\nu}(\gamma(t))Y^\nu(\gamma(t))\right)$
$= \frac{d}{dt}Y^\rho(\gamma(t)) + \omega^\rho_{\mu\nu}(\gamma(t))\frac{d\gamma^\mu(t)}{dt}Y^\nu(\gamma(t)),$ (3.41)


where $\gamma^\mu : I \to \mathbb{R}$ are the coordinates of the curve (in some given chart), as before.


Definition 3.6 A (necessarily unique) vector field $t \mapsto Y_{\gamma(t)} \in T_{\gamma(t)}M$ defined along a given curve
$\gamma$ is the parallel-transport of some initial vector $Y \in T_{\gamma(0)}M$ along $\gamma$ if $Y$ satisfies


$\nabla_{\dot\gamma}Y = 0.$ (3.42)


This generalizes the Euclidean practice of freely moving vectors in $\mathbb{R}^n$ from place to place,
to arbitrary (semi) Riemannian manifolds. The price one pays is that such motions can only
be carried out once a linear connection has been defined. The flat connection on $\mathbb{R}^n$ (with
flat metric $g = \delta$), defined in the standard coordinates by $\omega^\rho_{\mu\nu} = 0$, gives $\nabla_\mu = \partial_\mu$ and hence
$Y_{\gamma(t)} = Y_{\gamma(0)} = Y$ for all $t$. Hence “freely moving vectors” in $\mathbb{R}^n$ is relative to this flat connection.
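
By contrast, for a non-flat connection a vector generally does not return to itself after parallel transport around a closed curve. The sketch below is an illustration under stated assumptions: it takes the unit sphere with coordinates $(\theta,\varphi)$, uses as connection coefficients the Christoffel symbols $\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta$, $\Gamma^\varphi_{\varphi\theta} = \cot\theta$ of the round metric (quoted, not derived here), and solves (3.42) in the form (3.41) along the circle of latitude $t \mapsto (\theta_0, t)$, $t \in [0, 2\pi]$, with a Runge-Kutta integrator. Function names are hypothetical.

```python
import math


def transport_rhs(theta0, Y):
    """Right-hand side of dY/dt = -Gamma * Y along the latitude theta = theta0, phi = t."""
    Y_theta, Y_phi = Y
    return (math.sin(theta0) * math.cos(theta0) * Y_phi,
            -math.cos(theta0) / math.sin(theta0) * Y_theta)


def transport_once_around(theta0, Y, steps=4000):
    """Parallel-transport Y = (Y^theta, Y^phi) once around the latitude circle (RK4)."""
    h = 2.0 * math.pi / steps
    for _ in range(steps):
        k1 = transport_rhs(theta0, Y)
        k2 = transport_rhs(theta0, tuple(y + 0.5 * h * k for y, k in zip(Y, k1)))
        k3 = transport_rhs(theta0, tuple(y + 0.5 * h * k for y, k in zip(Y, k2)))
        k4 = transport_rhs(theta0, tuple(y + h * k for y, k in zip(Y, k3)))
        Y = tuple(y + h / 6.0 * (a + 2 * b + 2 * c + d)
                  for y, a, b, c, d in zip(Y, k1, k2, k3, k4))
    return Y


theta0 = math.pi / 3                        # latitude with cos(theta0) = 1/2
Y_end = transport_once_around(theta0, (1.0, 0.0))
norm_end = math.sqrt(Y_end[0] ** 2 + (math.sin(theta0) * Y_end[1]) ** 2)
```

In the orthonormal components $(Y^\theta, \sin\theta_0\, Y^\varphi)$ the transport equation is a rotation with angular velocity $\cos\theta_0$, so after one loop at $\theta_0 = \pi/3$ the vector is rotated by $\pi$: its length is unchanged but its direction is reversed.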

Like the Christoffel symbols, the connection coefficients do not form the components of a
tensor (the relation between the two will be clarified shortly). However, various tensors may be
defined via the connection. For now, we just define the torsion $\tau_\nabla \in \mathfrak{X}^{(2,1)}(M)$ of $\nabla$ by
$\tau_\nabla(X,Y,\theta) = \theta(\nabla_XY - \nabla_YX - [X,Y]).$ (3.43)


A simple computation shows that this expression is $C^\infty(M)$-linear in each entry, so Proposition
2.7 shows $\tau_\nabla$ is indeed a tensor of the said kind. In the coordinate basis $(\partial_\mu)$, we have


$\tau^\rho_{\mu\nu} = \omega^\rho_{\mu\nu} - \omega^\rho_{\nu\mu},$ (3.44)


since $[\partial_\mu,\partial_\nu] = 0$. Hence the connection $\nabla$ is torsion-free iff any of the following hold:


$\omega^\rho_{\mu\nu} = \omega^\rho_{\nu\mu};$ (3.45)
$\nabla_XY - \nabla_YX = [X,Y].$ (3.47)


We are now in a position to restate and generalize Definition 3.1: 


Definition 3.7 Given some linear connection $\nabla$ on $M$, a geodesic in $M$ is a curve $\gamma$ for which


$\nabla_{\dot\gamma}\dot\gamma = 0.$ (3.48)
That is, the tangent vector $\dot\gamma$ to $\gamma$ is parallel transported along $\gamma$. As before, this definition requires
a specific parametrization of $\gamma$, which is unique up to affine transformations of $t$. One has a
similar situation as in the metric case for detecting “wrongly parametrized” geodesics:


Proposition 3.8 Some curve $\gamma: [a,b] \to M$ can be reparametrized so as to become a geodesic
iff it satisfies (3.48) with the right-hand side $0$ replaced by $f\dot\gamma$, for some function $f(t)$ defined along $\gamma$.


The proof is analogous to that of Proposition 3.2. Using (local) coordinates, eq. (3.48) may be brought
into a form that is strikingly similar to (3.24). Since according to (3.41) with $Y \rightsquigarrow \dot\gamma$ the expression
$\dot\gamma^\mu\partial_\mu\dot\gamma^\rho$ is just $d^2\gamma^\rho/dt^2 = \ddot\gamma^\rho$, we obtain


$\ddot\gamma^\rho + \omega^\rho_{\mu\nu}\dot\gamma^\mu\dot\gamma^\nu = 0,$ (3.49)


from which it is obvious that geodesics are insensitive to the torsion (3.44) of the connection. Eq.
(3.49) looks like the geodesic equation (3.24), with the difference that in (3.49) the coefficients
$\omega^\rho_{\mu\nu}$ are defined by (3.37) in terms of an arbitrary linear connection $\nabla$, whereas those in (3.24)
are the Christoffel symbols (3.25) defined by the metric. Their relationship is as follows.


Theorem 3.9 (Levi-Civita) Any (semi) Riemannian manifold $(M,g)$ admits a unique linear
connection $\nabla$ (called the Levi-Civita connection) that satisfies the following two properties:


1. The torsion $\tau_\nabla$ associated to $\nabla$ vanishes, i.e. $\nabla_XY - \nabla_YX = [X,Y]$.
2. The connection $\nabla$ and the metric $g$ are related by the condition that for all $X,Y,Z \in \mathfrak{X}(M)$,
$X(g(Y,Z)) = g(\nabla_XY,Z) + g(Y,\nabla_XZ).$ (3.50)
These conditions imply that the connection coefficients of $\nabla$ are the Christoffel symbols (3.25):
$\omega^\rho_{\mu\nu} = \Gamma^\rho_{\mu\nu}.$ (3.51)
As soon as we have extended $\nabla$ to arbitrary tensors, we will see that (3.50) comes down to
$\nabla_Xg = 0 \quad \forall X \in \mathfrak{X}(M).$ (3.52)
Also, $X(g(Y,Z))$ will be the same as $\nabla_X(g(Y,Z))$, hence some authors elegantly write (3.50) as
$\nabla_X\langle Y,Z\rangle = \langle\nabla_XY,Z\rangle + \langle Y,\nabla_XZ\rangle.$ (3.53)
Proof. Using (3.47) and (3.50), one computes
$X(g(Y,Z)) - Z(g(X,Y)) + Y(g(Z,X)),$
and rearranges this to obtain the so-called Koszul formula, partly written in the notation (3.53):
$\langle\nabla_XY,Z\rangle = \tfrac{1}{2}\left(X\langle Y,Z\rangle + Y\langle Z,X\rangle - Z\langle X,Y\rangle - \langle X,[Y,Z]\rangle + \langle[X,Y],Z\rangle + \langle Y,[Z,X]\rangle\right).$ (3.54)
Since $g$ is nondegenerate this uniquely fixes $\nabla_XY$, and in a coordinate basis this gives (3.51).


To prove existence, one easily checks (3.32) and (3.33) from (3.54). Finally, running the
derivation of (3.54) from (3.47) and (3.50) backwards verifies (3.47) and (3.50).
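
The step from the Koszul formula (3.54) to the Christoffel symbols (3.51) is only asserted in the proof; it can be spelled out as follows, taking $X = \partial_\mu$, $Y = \partial_\nu$, $Z = \partial_\sigma$ (our choice of labels).

```latex
% Coordinate vector fields commute, so all three bracket terms in (3.54) drop out:
\begin{align*}
g(\nabla_\mu\partial_\nu,\partial_\sigma)
  &= \tfrac12\bigl(g_{\nu\sigma,\mu} + g_{\sigma\mu,\nu} - g_{\mu\nu,\sigma}\bigr). \\
\intertext{On the other hand, (3.37) gives $\nabla_\mu\partial_\nu = \omega^\tau_{\mu\nu}\partial_\tau$, so that}
g(\nabla_\mu\partial_\nu,\partial_\sigma) &= \omega^{\tau}_{\mu\nu}\,g_{\tau\sigma}. \\
\intertext{Contracting with the inverse metric $g^{\rho\sigma}$ then yields}
\omega^{\rho}_{\mu\nu}
  &= \tfrac12\,g^{\rho\sigma}\bigl(g_{\sigma\mu,\nu} + g_{\sigma\nu,\mu} - g_{\mu\nu,\sigma}\bigr)
   = \Gamma^{\rho}_{\mu\nu},
\end{align*}
% which is (3.25) with (i,j,k,m) replaced by (\rho,\mu,\nu,\sigma).
```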
3.4 General connections on vector bundles


For a more general understanding of the above constructions, as well as for a clean extension 
of linear connections from vector fields to arbitrary tensors (which one often needs in GR), we 
briefly discuss connections on arbitrary vector bundles. Similar to Definition 3.5, we put:


Definition 3.10 A connection on a vector bundle $E \to M$ is a linear map
$X \mapsto \nabla_X : \Gamma(E) \to \Gamma(E),$ (3.55)
where $X \in \mathfrak{X}(M)$, such that:
1. The map $X \mapsto \nabla_X$ is $\mathbb{R}$-linear as well as $C^\infty(M)$-linear in $X$, cf. (3.32);


2. The map $s \mapsto \nabla_Xs$ is $\mathbb{R}$-linear but not $C^\infty(M)$-linear: it satisfies the Leibniz rule


$\nabla_X(fs) = (Xf)s + f\nabla_Xs \quad (f \in C^\infty(M)).$ (3.56)


A linear connection is then a connection (in the above sense) on the tangent bundle. The general
story is almost the same, including the localization of $\nabla_Xs(x)$ to the flow lines of $X$ arbitrarily
close to $x$, and hence to any $U \in \mathcal{O}(M)$, $x \in U$. In particular, define a local frame $(u_a)$, where
$a = 1,\ldots,k = \dim(E_x)$, i.e. the rank of $E$, by the properties that (i) $u_a \in \Gamma(U,E)$, i.e., the
restriction of $\Gamma(E) \equiv \Gamma(M,E)$ to some $U \in \mathcal{O}(M)$; and (ii) the set $(u_a(x))_{a=1,\ldots,\dim(E_x)}$ forms a
basis of $E_x$ for all $x \in U$. This once again yields connection coefficients defined by
$\nabla_\mu u_b = C^c_{\mu b}u_c.$ (3.57)


The difference with the tangent bundle is that the three indices carried by $C$ are no longer of the
same type: $b$ and $c$ label basis vectors in $E_x$, whereas $\mu$ refers to the canonical coordinate basis of
$TM$ (recall that $\nabla_\mu = \nabla_{\partial_\mu}$). Writing $s(x) = s^a(x)u_a(x)$, we now have


$\nabla_\mu s^a = \partial_\mu s^a + C^a_{\mu b}s^b,$ (3.58)
cf. (3.39). This is often written as
$\nabla_\mu s = \partial_\mu s + \omega_\mu s,$ (3.59)


in which $s$ is seen as a vector with components $s^a$ relative to the given basis $(u_a)$ and hence $\omega_\mu$
is a matrix with components $C^a_{\mu b}$, or $s \in \Gamma(E)$ and $\omega_\mu(x) \in \mathrm{Hom}(E_x,E_x)$.¹⁴⁵

A vector bundle $E$ may be equipped with a metric, i.e. a nondegenerate symmetric bilinear
form $g_x : E_x \times E_x \to \mathbb{R}$ defined for each $x \in M$, that is smooth in $x$ in the sense that for any
$s,t \in \Gamma(E)$ the function $g(s,t) : M \to \mathbb{R}$ defined by $x \mapsto g_x(s(x),t(x))$ is smooth. For example,
a (semi) Riemannian metric on $M$ is a metric on $E = TM$ in precisely this sense. A connection
$\nabla$ on $E$ is then called metric if for all $s,t \in \Gamma(E)$ we have


$X(g(s,t)) = g(\nabla_Xs,t) + g(s,\nabla_Xt).$ (3.60)


145 Even more abstractly, connections may be regarded as maps $\nabla : \Gamma(E) \to \Gamma(T^*M \otimes E) \equiv \Omega^1(E)$, i.e. the space
of $E$-valued 1-forms, that satisfy $\nabla(fs) = df \otimes s + f\nabla s$; the connection with the main text is $\nabla_Xs = \nabla s(X)$. In that
case we may write $\nabla = d + \omega$, where $\omega \in \Omega^1(\mathrm{Hom}(E,E))$, i.e. $\omega$ is a 1-form taking values in the vector bundle
$\mathrm{Hom}(E,E)$. Even more generally (for those familiar with the de Rham complex $\Omega^\bullet(M)$ and its relative $\Omega^\bullet(E)$), we
may define $\nabla : \Omega^p(E) \to \Omega^{p+1}(E)$, where $p = 0,\ldots,k$ with $\Omega^0(E) = \Gamma(E)$, as the unique extension of the above
map $\nabla : \Omega^0(E) \to \Omega^1(E)$ that satisfies $\nabla(\alpha \otimes s) = d\alpha \otimes s + (-1)^p\alpha \wedge \nabla s$, where $\alpha \in \Omega^p(M)$ and $s \in \Gamma(E)$.
For example, the Levi-Civita connection on $TM$ is obviously metric in this sense.
Furthermore, take $E = T^*M$, and define $\nabla^*$ in coordinates through its components by


$\nabla^*_\mu\theta_\nu := \partial_\mu\theta_\nu - \Gamma^\rho_{\mu\nu}\theta_\rho,$ (3.61)


where the $\Gamma^\rho_{\mu\nu}$ are the Christoffel symbols defined by some (semi) Riemannian metric on $M$, cf.
(3.25). This turns out to be a connection indeed (check the axioms), whose rationale (notably of
the minus sign!) is the Leibniz-type property (or product rule)


$X(\theta(Y)) = (\nabla^*_X\theta)(Y) + \theta(\nabla_XY),$ (3.62)
which, omitting the star, may look even more elegant in the form
$\nabla_X\langle\theta,Y\rangle = \langle\nabla_X\theta,Y\rangle + \langle\theta,\nabla_XY\rangle,$ (3.63)


where by fiat we have declared that on functions (such as $\langle\theta,Y\rangle = \theta(Y)$) the covariant derivative
$\nabla_X$ is simply $X$, i.e. $\nabla_Xf = Xf$, $f \in C^\infty(M)$. Eq. (3.62) or (3.63) might, of course, have been
used to define $\nabla^* \equiv \nabla : \Omega(M) \to \Omega(M)$ in the first place, yielding (3.61). In fact, any linear
connection defines a dual connection $\nabla^*$ on $T^*M$ by (3.62).
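
To see where the minus sign in (3.61) comes from, one may evaluate the product rule (3.62) on coordinate fields. The following short computation assumes the Levi-Civita connection, so that $\nabla_\mu\partial_\nu = \Gamma^\rho_{\mu\nu}\partial_\rho$, and writes $\theta = \theta_\rho\,dx^\rho$.

```latex
% Apply (3.62) to X = \partial_\mu, Y = \partial_\nu, so that \theta(Y) = \theta_\nu:
\begin{align*}
\partial_\mu\theta_\nu
  &= (\nabla^*_\mu\theta)(\partial_\nu) + \theta(\Gamma^\rho_{\mu\nu}\partial_\rho)
   = (\nabla^*_\mu\theta)_\nu + \Gamma^\rho_{\mu\nu}\theta_\rho, \\
(\nabla^*_\mu\theta)_\nu
  &= \partial_\mu\theta_\nu - \Gamma^\rho_{\mu\nu}\theta_\rho ,
\end{align*}
% which is (3.61): the minus sign compensates the plus sign in (3.39).
```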

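Combining (3.39) and (3.61), we define a covariant derivative $\nabla^{(k,l)}_\mu : \mathfrak{X}^{(k,l)}(M) \to \mathfrak{X}^{(k,l)}(M)$ by


$(\nabla^{(k,l)}_\mu T)^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_k} \equiv \nabla_\mu T^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_k} := \partial_\mu T^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_k} + \Gamma^{\rho_1}_{\mu\sigma}T^{\sigma\rho_2\cdots\rho_l}_{\nu_1\cdots\nu_k} + \cdots + \Gamma^{\rho_l}_{\mu\sigma}T^{\rho_1\cdots\rho_{l-1}\sigma}_{\nu_1\cdots\nu_k}$
$- \Gamma^{\sigma}_{\mu\nu_1}T^{\rho_1\cdots\rho_l}_{\sigma\nu_2\cdots\nu_k} - \cdots - \Gamma^{\sigma}_{\mu\nu_k}T^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_{k-1}\sigma}.$ (3.64)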


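Those who do not like coordinate definitions “by formula” may be reassured that $\nabla^{(k,l)}$ is the
unique connection on $T^{(k,l)}M$ that, similarly to (3.63), satisfies the Leibniz rule


$\nabla_X(T(X_1,\ldots,X_k,\theta^1,\ldots,\theta^l)) = (\nabla^{(k,l)}_XT)(X_1,\ldots,X_k,\theta^1,\ldots,\theta^l)$
$+ T(\nabla_XX_1,\ldots,X_k,\theta^1,\ldots,\theta^l) + \cdots + T(X_1,\ldots,X_k,\theta^1,\ldots,\nabla_X\theta^l),$ (3.65)


where the case $k = l = 0$ is taken to mean $\nabla^{(0,0)}_X = X$ on $\mathfrak{X}^{(0,0)}(M) = C^\infty(M)$. Eq. (3.65)
recovers $\nabla^{(0,1)} = \nabla$ on $\mathfrak{X}^{(0,1)}(M) = \mathfrak{X}(M)$ as well as $\nabla^{(1,0)} = \nabla^*$ on $\mathfrak{X}^{(1,0)}(M) = \Omega(M)$.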

This construction of $\nabla^{(k,l)}$ works for any linear connection $\nabla$. If the latter is the Levi-Civita
connection, then (3.65) implies that its defining property (3.50) elegantly reads


$\nabla^{(2,0)}_Xg \equiv \nabla_Xg = 0.$ (3.66)
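As in (3.66), in general one often writes $\nabla$ for any $\nabla^{(k,l)}$, and physicists write (3.66) as


$g_{\mu\nu;\sigma} = 0,$ (3.67)
in semi-colon notation, in which $T^{\cdots}_{\cdots;\mu}$ means $\nabla_\mu T^{\cdots}_{\cdots}$, much as $T^{\cdots}_{\cdots,\mu}$ means
$\partial_\mu T^{\cdots}_{\cdots}$. As an application, let us show once again that $d(\|\dot\gamma(t)\|^2)/dt = 0$ for geodesics $\gamma$:
$\frac{d\|\dot\gamma(t)\|^2}{dt} = \frac{d\,g(\dot\gamma,\dot\gamma)}{dt} = \nabla_{\dot\gamma}(g(\dot\gamma,\dot\gamma)) = (\nabla_{\dot\gamma}g)(\dot\gamma,\dot\gamma) + g(\nabla_{\dot\gamma}\dot\gamma,\dot\gamma) + g(\dot\gamma,\nabla_{\dot\gamma}\dot\gamma),$
where we used (3.65). Eqs. (3.66) and (3.48) then make the right-hand side 0+0+0=0.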
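
Metric compatibility in the component form $g_{\mu\nu;\sigma} = 0$ can also be verified directly from the coordinate definitions. The check below assumes the standard coordinate expression of the covariant derivative of a tensor with two lower indices (one $-\Gamma$ term per index) together with the Christoffel symbols (3.25) written in Greek indices.

```latex
\begin{align*}
g_{\mu\nu;\sigma}
  &= g_{\mu\nu,\sigma} - \Gamma^\rho_{\sigma\mu}\,g_{\rho\nu} - \Gamma^\rho_{\sigma\nu}\,g_{\mu\rho}, \\
\Gamma^\rho_{\sigma\mu}\,g_{\rho\nu}
  &= \tfrac12\bigl(g_{\nu\sigma,\mu} + g_{\nu\mu,\sigma} - g_{\sigma\mu,\nu}\bigr), \\
\Gamma^\rho_{\sigma\nu}\,g_{\mu\rho}
  &= \tfrac12\bigl(g_{\mu\sigma,\nu} + g_{\mu\nu,\sigma} - g_{\sigma\nu,\mu}\bigr), \\
g_{\mu\nu;\sigma}
  &= g_{\mu\nu,\sigma} - \tfrac12\,g_{\nu\mu,\sigma} - \tfrac12\,g_{\mu\nu,\sigma} = 0 .
\end{align*}
% The four remaining first-derivative terms cancel pairwise by the symmetry of g.
```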
Alternatively, one may recall the description (2.64) of $T^{(k,l)}M$ as the tensor product of
$k$ copies of $T^*M$ and $l$ copies of $TM$. In general, given two vector bundles $E^{(1)} \to M$ and
$E^{(2)} \to M$, with connections $\nabla^{(1)}$ and $\nabla^{(2)}$, there is a unique connection $\nabla^{(1\otimes 2)}$ on the vector
bundle tensor product $E^{(1)} \otimes E^{(2)} \to M$ that satisfies the product rule
$\nabla^{(1\otimes 2)}_X(s^{(1)} \otimes s^{(2)}) = \nabla^{(1)}_X(s^{(1)}) \otimes s^{(2)} + s^{(1)} \otimes \nabla^{(2)}_X(s^{(2)}).$ (3.68)


This may be iterated to the tensor product of finitely many vector bundles, and hence (for any
linear connection $\nabla$) the connection $\nabla^{(k,l)}$ defined by (3.64) or (3.65) is just the tensor product
of the individual connections on each copy of $TM$ or $T^*M$ present in $T^{(k,l)}M$.

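It follows from (3.62) that (for any $\nabla$) the connection $\nabla^{(k,l)}$ commutes with contraction.
Contracting the first upper and lower indices and writing $T^{\rho_2\cdots\rho_l}_{\nu_2\cdots\nu_k} = T^{\sigma\rho_2\cdots\rho_l}_{\sigma\nu_2\cdots\nu_k}$, one has


$(\nabla^{(k-1,l-1)}_\mu T)^{\rho_2\cdots\rho_l}_{\nu_2\cdots\nu_k} = (\nabla^{(k,l)}_\mu T)^{\sigma\rho_2\cdots\rho_l}_{\sigma\nu_2\cdots\nu_k},$ (3.69)


and similarly for any other pair of upper and lower indices. In particular, this makes the physicists’
notation $T^{\sigma\cdots}_{\sigma\cdots;\mu}$ unambiguous. For example, for the Ricci tensor (see §4.5) we have
$R_{\mu\nu;\sigma} = R^\rho{}_{\mu\rho\nu;\sigma}.$ (3.70)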


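If $\nabla$ satisfies (3.52), then $\nabla^{(k,l)}$ in addition commutes with contraction in the metric sense
explained before (3.15), so that e.g., using (3.67), for the Ricci scalar we have


$R_{;\sigma} = R^\mu{}_{\mu;\sigma} = (g^{\mu\nu}R_{\mu\nu})_{;\sigma} = g^{\mu\nu}{}_{;\sigma}R_{\mu\nu} + g^{\mu\nu}R_{\mu\nu;\sigma} = g^{\mu\nu}R_{\mu\nu;\sigma}.$ (3.71)
Finally, $\nabla^{(k,l)}$ may be used to rewrite the formula (2.94) for the Lie derivative as
$(\mathcal{L}_XT)^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_k} = X^\sigma\nabla_\sigma T^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_k} + (\nabla_{\nu_1}X^\sigma)T^{\rho_1\cdots\rho_l}_{\sigma\nu_2\cdots\nu_k} + \cdots + (\nabla_{\nu_k}X^\sigma)T^{\rho_1\cdots\rho_l}_{\nu_1\cdots\nu_{k-1}\sigma}$
$- (\nabla_\sigma X^{\rho_1})T^{\sigma\rho_2\cdots\rho_l}_{\nu_1\cdots\nu_k} - \cdots - (\nabla_\sigma X^{\rho_l})T^{\rho_1\cdots\rho_{l-1}\sigma}_{\nu_1\cdots\nu_k},$ (3.72)

since all Christoffel symbols cancel out (check!).¹⁴⁶ For example, using (3.52) we obtain
$\mathcal{L}_Xg_{\mu\nu} = (\nabla_\mu X^\rho)g_{\rho\nu} + (\nabla_\nu X^\rho)g_{\mu\rho} = X_{\nu;\mu} + X_{\mu;\nu}.$ (3.73)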

A vector field $X$ for which $\mathcal{L}_Xg = 0$ is called a Killing (vector) field.¹⁴⁷ Eq. (3.73) gives
$X_{\nu;\mu} + X_{\mu;\nu} = 0.$ (3.74)
Flows of Killing fields are isometries, that is, diffeomorphisms preserving the metric. In the
notation of (2.84), this means that $(\psi_t)^*g = g$, which is usually written as $\psi_t^*g = g$. By (2.95),
Killing fields always form a Lie algebra, whose associated Lie group (up to global analytic
issues) is the subgroup of $\mathrm{Diff}(M)$ consisting of isometries.

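In Minkowski space-time $(\mathbb{M},\eta)$, the Christoffel symbols vanish (at least in the usual
coordinates), so that $\nabla_\mu = \partial_\mu$ and $X_{\mu;\nu} = X_{\mu,\nu}$. Hence Killing fields satisfy $\partial_\mu X_\nu = -\partial_\nu X_\mu$,
whose general solution is a 10-dimensional vector space (within $\mathfrak{X}(\mathbb{R}^4)$) with basis


$X_{(\nu)} = \partial_\nu; \quad X_{(\rho\sigma)} = x_\rho\partial_\sigma - x_\sigma\partial_\rho;$
$X^\mu_{(\nu)} = \delta^\mu_\nu \ (\nu = 0,1,2,3); \quad X^\mu_{(\rho\sigma)}(x) = x_\rho\delta^\mu_\sigma - x_\sigma\delta^\mu_\rho \ (\rho,\sigma = 0,1,2,3;\ \rho < \sigma),$ (3.75)


where $x_\rho = \eta_{\rho\sigma}x^\sigma$. This is the Lie algebra of the Poincaré group (the isometry group of $(\mathbb{M},\eta)$, generated by
translations and by the subgroup of $GL_4(\mathbb{R})$ preserving the Minkowski metric $\eta$). See also Appendix A, §§A.1 - A.2.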
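
The claim that the fields in (3.75) solve $\partial_\mu X_\nu = -\partial_\nu X_\mu$ can be checked mechanically. The sketch below assumes the usual coordinates with $\eta = \mathrm{diag}(-1,1,1,1)$ and uses that for $X_{(\rho\sigma)} = x_\rho\partial_\sigma - x_\sigma\partial_\rho$ the lowered components are $X_\mu = \eta_{\mu\sigma}x_\rho - \eta_{\mu\rho}x_\sigma$, whose derivative $\partial_\nu X_\mu = \eta_{\mu\sigma}\eta_{\rho\nu} - \eta_{\mu\rho}\eta_{\sigma\nu}$ is a constant matrix; function names are hypothetical.

```python
ETA = (-1, 1, 1, 1)   # diagonal of the Minkowski metric eta


def eta(a, b):
    """Component eta_ab of the Minkowski metric."""
    return ETA[a] if a == b else 0


def gradient_of_rotation_field(rho, sigma):
    """Matrix B[mu][nu] = d_nu X_mu for X = x_rho d_sigma - x_sigma d_rho."""
    return [[eta(mu, sigma) * eta(rho, nu) - eta(mu, rho) * eta(sigma, nu)
             for nu in range(4)] for mu in range(4)]


def satisfies_killing_equation(B):
    """Check d_mu X_nu + d_nu X_mu = 0, i.e. antisymmetry of B, cf. (3.74)."""
    return all(B[mu][nu] + B[nu][mu] == 0
               for mu in range(4) for nu in range(4))


# Independent generators: rho < sigma, since X_(rho sigma) = -X_(sigma rho).
pairs = [(rho, sigma) for rho in range(4) for sigma in range(rho + 1, 4)]
all_killing = all(satisfies_killing_equation(gradient_of_rotation_field(r, s))
                  for r, s in pairs)
dimension = 4 + len(pairs)   # four translations plus six rotations and boosts
```

The translations $X_{(\nu)} = \partial_\nu$ have constant components and hence satisfy the Killing equation trivially; together with the six fields checked here this accounts for the dimension 10.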


146 $\mathcal{L}$ is not a connection (as it fails to be $C^\infty(M)$-linear in $X$), but $\mathcal{L}_X$ and $\nabla_X$ both satisfy the Leibniz rule.
147 Named after the German mathematician Wilhelm Killing (1847-1923), not the movie about Cambodia.
4 Curvature


The notion of curvature was originally introduced by Gauss in the context of lines in $\mathbb{R}^2$ and $\mathbb{R}^3$
and surfaces in $\mathbb{R}^3$. The modern approach via connections is highly abstract (and hence very
powerful), but we shall recover at least some of the original ideas of Gauss c.s. later on.


4.1 Curvature tensor for general connections 


For any connection $\nabla$ on a vector bundle $E \to M$, the following map, indexed by $X,Y \in \mathfrak{X}(M)$,


$\Omega(X,Y) : \Gamma(E) \to \Gamma(E);$ (4.1)
$\Omega(X,Y) := [\nabla_X,\nabla_Y] - \nabla_{[X,Y]}$ (4.2)


is easily verified to be $C^\infty(M)$-linear in its argument $s \in \Gamma(E)$.¹⁴⁸ Furthermore, $\Omega(X,Y)$ is
$C^\infty(M)$-linear in $X$ and $Y$, so that we may equivalently write $\Omega$ as either of the following maps:


$\Omega : \mathfrak{X}(M) \times \mathfrak{X}(M) \times \Gamma(E) \to \Gamma(E);$ (4.3)
$\hat\Omega : \mathfrak{X}(M) \times \mathfrak{X}(M) \times \Gamma(E^*) \times \Gamma(E) \to C^\infty(M),$ (4.4)


where the first is three times $C^\infty(M)$-linear and the second four times so; the relationship between
$\Omega$ as defined in (4.3) and $\hat\Omega$ is induced by a pointwise version of the (linear) isomorphism


$\mathrm{Hom}(V^* \times V,\mathbb{R}) \cong \mathrm{Hom}(V,V);$ (4.5)
$\hat\varphi(\theta,v) = \theta(\varphi(v)).$ (4.6)


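In the usual basis $(\partial_\mu)$ associated to a chart defining coordinates $(x^\mu)$ we may write (4.1) as
$[\nabla_\mu,\nabla_\nu]s(x) = \Omega_{\mu\nu}(x)s(x),$ (4.7)


where $\Omega_{\mu\nu} = \Omega(\partial_\mu,\partial_\nu)$ is a linear map $E_x \to E_x$. Relative to a local frame $(u_a)$ for $\Gamma(E)$ in
which $s(x) = s^a(x)u_a(x)$, with $s^a \in C^\infty(U)$, see text after (3.56), we may therefore write


$[\nabla_\mu,\nabla_\nu]s^a(x) = \Omega^a{}_{b\mu\nu}(x)s^b(x),$ (4.8)
where, switching to the version (4.4), we have the coordinate- and basis-dependent expression
$\Omega^a{}_{b\mu\nu} = \hat\Omega(\partial_\mu,\partial_\nu,\omega^a,u_b),$ (4.9)
with $(\omega^a)$ the frame of $E^*$ dual to $(u_a)$.


Thus the curvature tensor $\Omega$ defined by a connection $\nabla$ has four indices: the first two (i.e. $a$ and
$b$) refer to a basis of $E_x$, whereas the last two (viz. $\mu$ and $\nu$) refer to a basis of $T_xM$.
In the case $E = TM$ the distinction between $(\mu,\nu)$ and $(a,b)$ is blurred. Our maps become


$\Omega(X,Y) : \mathfrak{X}(M) \to \mathfrak{X}(M); \quad Z \mapsto ([\nabla_X,\nabla_Y] - \nabla_{[X,Y]})Z;$ (4.10)
$\hat\Omega : \Omega(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M) \to C^\infty(M); \quad (\theta,Z,X,Y) \mapsto \theta(\Omega(X,Y)Z),$ (4.11)


where in (4.11) we adopt the convention of moving $\Gamma(E^*) = \Omega(M)$ in (4.4) to the front.¹⁴⁹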


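148 It follows that $\Omega(X,Y)$ defines an element of $\Gamma(\mathrm{Hom}(E,E))$, i.e. a cross-section of $\mathrm{Hom}(E,E)$.
149 Regarding a connection as a map $\nabla : \Omega^p(E) \to \Omega^{p+1}(E)$, as in footnote 145, the corresponding curvature is
simply defined as $\nabla^2 : \Omega^p(E) \to \Omega^{p+2}(E)$, so that $\nabla^2u = R \wedge u$ for some $R \in \Omega^2(\mathrm{Hom}(E,E))$.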
4.2 Riemann tensor


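We now fix a metric on $M$ and take $\nabla$ to be the Levi-Civita connection on $TM$ defined by $g$.
Denoting $\hat\Omega$ by Riem (or $R$), we obtain the Riemann tensor $\mathrm{Riem} \in \mathfrak{X}^{(3,1)}(M)$, defined by


$\mathrm{Riem}(\theta,Z,X,Y) = \theta(\Omega(X,Y)Z) = \theta(([\nabla_X,\nabla_Y] - \nabla_{[X,Y]})Z).$ (4.12)
In coordinates, where $R^\rho{}_{\sigma\mu\nu} = \mathrm{Riem}(dx^\rho,\partial_\sigma,\partial_\mu,\partial_\nu)$ and $[\partial_\mu,\partial_\nu] = 0$, we therefore have


$[\nabla_\mu,\nabla_\nu]Z^\rho = R^\rho{}_{\sigma\mu\nu}Z^\sigma;$ (4.13)


$R^\rho{}_{\sigma\mu\nu} = \Gamma^\rho_{\nu\sigma,\mu} - \Gamma^\rho_{\mu\sigma,\nu} + \Gamma^\rho_{\mu\tau}\Gamma^\tau_{\nu\sigma} - \Gamma^\rho_{\nu\tau}\Gamma^\tau_{\mu\sigma},$ (4.14)


where the Christoffel symbols are defined by (3.25), i.e., this time in Greek indices,


$\Gamma^\rho_{\mu\nu} = \tfrac{1}{2}g^{\rho\sigma}(g_{\sigma\mu,\nu} + g_{\sigma\nu,\mu} - g_{\mu\nu,\sigma}).$ (4.15)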
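
As a simple worked example (ours, not taken from the text), consider the unit sphere with metric $d\theta^2 + \sin^2\theta\,d\varphi^2$. The computation assumes the coordinate expression $R^\rho{}_{\sigma\mu\nu} = \Gamma^\rho_{\nu\sigma,\mu} - \Gamma^\rho_{\mu\sigma,\nu} + \Gamma^\rho_{\mu\tau}\Gamma^\tau_{\nu\sigma} - \Gamma^\rho_{\nu\tau}\Gamma^\tau_{\mu\sigma}$, which follows from (4.12) with $\nabla_\mu\partial_\sigma = \Gamma^\tau_{\mu\sigma}\partial_\tau$.

```latex
% Unit sphere: g_{\theta\theta} = 1, g_{\varphi\varphi} = \sin^2\theta. Nonzero Christoffel symbols from (4.15):
%   \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \qquad
%   \Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta .
\begin{align*}
R^\theta{}_{\varphi\theta\varphi}
  &= \Gamma^\theta_{\varphi\varphi,\theta} - \Gamma^\theta_{\theta\varphi,\varphi}
   + \Gamma^\theta_{\theta\tau}\Gamma^\tau_{\varphi\varphi}
   - \Gamma^\theta_{\varphi\tau}\Gamma^\tau_{\theta\varphi} \\
  &= (\sin^2\theta - \cos^2\theta) - 0 + 0 + \cos^2\theta = \sin^2\theta, \\
R^\varphi{}_{\theta\varphi\theta}
  &= \Gamma^\varphi_{\theta\theta,\varphi} - \Gamma^\varphi_{\varphi\theta,\theta}
   + \Gamma^\varphi_{\varphi\tau}\Gamma^\tau_{\theta\theta}
   - \Gamma^\varphi_{\theta\tau}\Gamma^\tau_{\varphi\theta} \\
  &= 0 + \frac{1}{\sin^2\theta} + 0 - \cot^2\theta = 1 .
\end{align*}
% Contracting as in (3.14) and (3.15): R_{\theta\theta} = 1, R_{\varphi\varphi} = \sin^2\theta, R = 2.
```

Since these components do not vanish, no choice of coordinates can make the Christoffel symbols of the round sphere vanish on an open set.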


A (semi) Riemannian manifold $(M,g)$ is locally flat (or locally isometric to a flat space) if
each point $x \in M$ has a coordinate nbhd $U$ with a chart $\varphi : U \to \mathbb{R}^n$ and associated coordinates
$x^\mu = \varphi^\mu(x)$, see Definition 2.1.2, in which the metric is flat, i.e. $g_{\mu\nu}(x) = \delta_{\mu\nu}$ for each $x \in U$ in
the Riemannian case, $g_{\mu\nu}(x) = \eta_{\mu\nu}$ in the Lorentzian case, etc. The first nontrivial result about
the Riemann tensor (which was known to Riemann himself) is that it detects local flatness:


Theorem 4.1 A (semi) Riemannian manifold $(M,g)$ is locally flat iff $\mathrm{Riem} = 0$, that is,
$R^\rho{}_{\sigma\mu\nu} = 0.$ (4.16)


One direction is trivial: if $g_{\mu\nu}(x) = \delta_{\mu\nu}$ (etc.), then the Christoffel symbols (4.15) vanish, so
that (4.14) vanishes. Proving local flatness from $R^\rho{}_{\sigma\mu\nu} = 0$ relies on the Frobenius theorem:¹⁵⁰


Lemma 4.2 If $\mathrm{Riem} = 0$, i.e. (4.16), then each $x \in M$ has an open nbhd $U$ such that for any
$v \in T_xM$ there is a unique vector field $Z \in \mathfrak{X}(U)$ with $Z(x) = v$ and $\nabla_XZ = 0$ for all $X \in \mathfrak{X}(U)$.


Proof. We just sketch the proof and explain the role of (4.16). In local coordinates $(x^\mu)$ the
condition $\nabla_XZ = 0$ for all $X$ is equivalent to $\nabla_\mu Z^\rho = 0$ for all $\mu$. One can solve


$\nabla_\mu Z^\rho = \partial_\mu Z^\rho + \Gamma^\rho_{\mu\nu}Z^\nu = 0; \quad Z^\rho(x_0) = v^\rho,$ (4.17)


first for $\mu = 1$ at fixed $(x^2,\ldots,x^n)$, then for $\mu = 2$ at fixed $(x^1,x^3,\ldots,x^n)$, etc. The integrability
condition $[\nabla_\mu,\nabla_\nu]Z^\rho = 0$ for this procedure is satisfied, since by (4.13), this is the same as
$R^\rho{}_{\sigma\mu\nu}Z^\sigma = 0$, which holds by assumption, as in (4.16).
The thrust of the Frobenius theorem, then, is that the necessary condition (4.16) for the solution
of all equations $\nabla_XZ = 0$ is also sufficient. To prove the nontrivial direction of Theorem 4.1,
take an orthonormal basis $(e_a)$ of $T_xM$ (which exists because $g_x$ can be diagonalized at any point
$x \in M$) and extend this to a frame $(e_a(y))$ defined for each $y \in U$ (as in Lemma 4.2), so that


$e_a(x) = e_a; \quad \nabla_Xe_a = 0,$ (4.18)


¹⁵⁰ This holds for any vector bundle $E \to M$ with connection $\nabla$: if $\Omega = 0$, then each $x \in M$ has an open nbhd $U$
such that for any $v \in E_x$ there is a unique local section $s \in \Gamma(U,E)$ with $s(x) = v$ and $\nabla_X s = 0$ for all $X \in \mathfrak{X}(U)$. In
this generalised version the lemma is proved in e.g. Heckman (2017), Theorem 2.34.

for all $X$. Property (3.50) of the Levi-Civita connection then gives $X(g(e_a,e_b)) = 0$. Hence


$$g_y(e_a,e_b) = g_x(e_a,e_b) = \delta_{ab}, \tag{4.19}$$


and similarly for other signatures, for all $y \in U$. In particular, the vectors $(e_a)$ remain orthonormal
throughout $U$ and hence remain a basis of each $T_yM$, $y \in U$. Moreover, since $\nabla$ is torsion-free,


$$[e_a,e_b] = \nabla_{e_a}e_b - \nabla_{e_b}e_a = 0 - 0 = 0. \tag{4.20}$$


Using (2.34) and (2.35), this can be used to show that the flows of all vector fields $e_a$ commute,
which in turn implies that there is an open subset $V_x \subset T_xM$ such that the map

$$t^a e_a(x) \mapsto \varphi^{(1)}_{t^1} \circ \cdots \circ \varphi^{(n)}_{t^n}(x), \tag{4.21}$$

where $\varphi^{(a)}_t$ is the flow of the vector field $e_a$ emanating from $x$ (i.e., with initial value $\varphi^{(a)}_0(x) = x$),
is a diffeomorphism from $V_x$ onto its image $U' \subset U$ in $M$. If the image point of $t^a e_a$ under this
map is $y$, we then define its coordinates to be $(y^a = t^a)$.¹⁵¹ By construction, $e_a = \partial/\partial y^a$, so that

$$g_y(\partial_a, \partial_b) = g_y(e_a,e_b) = \delta_{ab}.$$
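Theorem 4.1 can be illustrated numerically. The following Python sketch (an illustration only, not part of the argument above) evaluates (4.15) and (4.14) by central finite differences for two-dimensional metrics. The function names `christoffel`, `riemann`, `polar` and `sphere`, the step size `h`, and the sample points are choices made here for the example. In polar coordinates the flat plane has the non-constant metric $dr^2 + r^2 d\varphi^2$, yet all components $R^\rho{}_{\sigma\mu\nu}$ vanish up to discretization error, whereas the round sphere of radius $a$ gives a nonzero $R_{1212}$.

```python
import math

def christoffel(metric, x, h=1e-4):
    """G[r][m][n] = Gamma^r_{mn} at x via (4.15); 2D metrics only."""
    def dmetric(s):  # matrix of partial_s g_{ab}, central differences
        xp, xm = list(x), list(x)
        xp[s] += h
        xm[s] -= h
        gp, gm = metric(xp), metric(xm)
        return [[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(2)] for a in range(2)]

    g = metric(x)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    ginv = [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]
    d = [dmetric(s) for s in range(2)]  # d[s][a][b] = partial_s g_{ab}
    return [[[0.5 * sum(ginv[r][s] * (d[n][s][m] + d[m][s][n] - d[s][m][n])
                        for s in range(2))
              for n in range(2)] for m in range(2)] for r in range(2)]

def riemann(metric, x, h=1e-4):
    """R[r][s][m][n] = R^r_{smn} at x via (4.14); 2D metrics only."""
    G = christoffel(metric, x, h)

    def dgamma(m):  # array of partial_m Gamma^r_{ab}
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        Gp, Gm = christoffel(metric, xp, h), christoffel(metric, xm, h)
        return [[[(Gp[r][a][b] - Gm[r][a][b]) / (2 * h) for b in range(2)]
                 for a in range(2)] for r in range(2)]

    D = [dgamma(m) for m in range(2)]  # D[m][r][a][b] = partial_m Gamma^r_{ab}
    return [[[[D[m][r][n][s] - D[n][r][m][s]
               + sum(G[r][m][t] * G[t][n][s] - G[r][n][t] * G[t][m][s]
                     for t in range(2))
               for n in range(2)] for m in range(2)] for s in range(2)] for r in range(2)]

def polar(x):
    """Flat plane in polar coordinates x = (r, phi)."""
    return [[1.0, 0.0], [0.0, x[0] ** 2]]

def sphere(x, a=2.0):
    """Round sphere of radius a in coordinates x = (theta, phi)."""
    return [[a * a, 0.0], [0.0, (a * math.sin(x[0])) ** 2]]

# Largest |R^r_{smn}| for the flat plane: zero up to discretization error.
flat_max = max(abs(c) for blk in riemann(polar, [1.3, 0.4])
               for rows in blk for row in rows for c in row)

# Sphere: lower the first index, R_1212 = g_11 R^1_{212} (the metric is diagonal).
point = [1.0, 0.3]
R_1212 = sphere(point)[0][0] * riemann(sphere, point)[0][1][0][1]
```

The value of `R_1212` can be compared with $a^2\sin^2\theta$, the expression used for the sphere in §4.3.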


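Proposition 4.3 Any torsion-free connection satisfies the Bianchi identities:¹⁵²
$$\Omega(X,Y)Z + \Omega(Y,Z)X + \Omega(Z,X)Y = 0; \tag{4.22}$$
$$(\nabla_X\Omega)(Y,Z) + (\nabla_Y\Omega)(Z,X) + (\nabla_Z\Omega)(X,Y) = 0. \tag{4.23}$$
For the Levi-Civita connection, these identities read
$$R^\rho{}_{\sigma\mu\nu} + R^\rho{}_{\mu\nu\sigma} + R^\rho{}_{\nu\sigma\mu} = 0; \tag{4.24}$$
$$R^\rho{}_{\sigma\mu\nu;\tau} + R^\rho{}_{\sigma\tau\mu;\nu} + R^\rho{}_{\sigma\nu\tau;\mu} = 0. \tag{4.25}$$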


Proof. The first one, in the form (4.22) using the definition (4.2), is most simply proved by taking
commuting vector fields $X$, $Y$, and $Z$, such as, in coordinates, $X = \partial_\mu$, $Y = \partial_\nu$, $Z = \partial_\sigma$, which
indeed leads to (4.24). One then finds that $\Omega(X,Y)Z + \Omega(Y,Z)X + \Omega(Z,X)Y$ is equal to

$$\nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z + \nabla_Y\nabla_Z X - \nabla_Z\nabla_Y X + \nabla_Z\nabla_X Y - \nabla_X\nabla_Z Y,$$

which vanishes if torsion-freeness (3.47) is taken into account, which for commuting vector fields means $\nabla_X Y = \nabla_Y X$.
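The second one is usually proved by using geodesic normal coordinates, cf. §5.1. Assuming
the reader is familiar with these, at the origin of these coordinates the Riemann tensor equals

$$R^\rho{}_{\sigma\mu\nu} = \tfrac12 g^{\rho\tau}(\partial_\sigma\partial_\mu g_{\nu\tau} - \partial_\nu\partial_\sigma g_{\mu\tau} + \partial_\nu\partial_\tau g_{\mu\sigma} - \partial_\mu\partial_\tau g_{\nu\sigma}). \tag{4.26}$$

Since at the origin $\nabla_\lambda R^\rho{}_{\sigma\mu\nu} = \partial_\lambda R^\rho{}_{\sigma\mu\nu}$, where $\partial_\lambda$ can even be taken inside the brackets in (4.26),
the identity (4.25) easily follows.¹⁵³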
The real nature of both Bianchi identities is that they are a consequence of the covariance property 


¹⁵¹ This construction defines geodesic normal coordinates in the special case at hand, as will be seen in §5.2.
152Continuing footnote 149, The differential Bianchi identity (4.23) simply reads VR = 0. 
¹⁵³ Another, more abstract proof of (4.23) follows from Cartan’s exterior calculus and the previous footnote.
$$\psi^*_{(3,1)}\mathrm{Riem}_g = \mathrm{Riem}_{\psi^*g}, \tag{4.27}$$

or $\psi^*\mathrm{Riem}_g = \mathrm{Riem}_{\psi^*g}$, for any diffeomorphism $\psi$. Here $\psi^*_{(3,1)}$ is defined as in (2.86), and we
have indicated the dependence of the Riemann tensor on the metric. First, eq. (4.27) reads

$$\mathrm{Riem}_g((\psi^{-1})^*\theta, \psi_*Z, \psi_*X, \psi_*Y) = \mathrm{Riem}_{\psi^*g}(\theta, Z, X, Y). \tag{4.28}$$
Using the definition (4.12) of the Riemann tensor, this follows from the underlying property
$$\nabla^{\psi^*g}_X Z = \psi_*^{-1}\big(\nabla^g_{\psi_*X}(\psi_*Z)\big), \tag{4.29}$$

where $\nabla^g$ is the Levi-Civita connection for the metric $g$. Eq. (4.29), in turn, follows from
Theorem 3.9, notably from the uniqueness of connections satisfying

$$X(\psi^*g(Y,Z)) = \psi^*g(\nabla^{\psi^*g}_X Y, Z) + \psi^*g(Y, \nabla^{\psi^*g}_X Z). \tag{4.30}$$
To see this, one defines a connection $\nabla'$ by $\nabla'_X Z = \psi_*^{-1}\big(\nabla^g_{\psi_*X}(\psi_*Z)\big)$ and shows that
$$X(\psi^*g(Y,Z)) = \psi^*g(\nabla'_X Y, Z) + \psi^*g(Y, \nabla'_X Z). \tag{4.31}$$


Eq. (4.27) is also true for a one-parameter family $\psi_s$, i.e. $\psi_s^*\mathrm{Riem}_g = \mathrm{Riem}_{\psi_s^*g}$; taking $d/ds$
at $s = 0$ yields both Bianchi identities.¹⁵⁴ The left-hand side then equals the Lie derivative $\mathcal{L}_X\mathrm{Riem}_g$,
where $X$ is the vector field whose flow is $\psi_s$, and this may in turn be expressed in terms of
the covariant derivatives of $\mathrm{Riem}_g$ using (3.72). The right-hand side may be computed using the
techniques explained in §7.2, notably (7.31) and its consequences for Riem and (7.47) - (7.49).
After lengthy calculations and multiple cancellations, one finds that both derivatives are equal iff


X*(Rouvie + Rocuiv + Rovesu) = 2(Vu(X "Byeo) — Vv (X"Biro)), (4.32) 
where Bew = Re aw + Rive + Rou, cf. (4.24). Choosing X = 0 at some given point gives 
(Vad BE (Vex Boge: (4.33) 


Taking VX” = ô; and using Bog = —Beyo inherited from Reve = RS uv» cf. (4.13), then 
forces BY. = 0, i.e. (4.24). Putting this in (4.32) and choosing X7 Æ 0 then gives (4.25). 
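We can lower the first index of the Riemann tensor to obtain $\mathrm{Riem}^\flat \in \mathfrak{X}^{(4,0)}(M)$, that is,

$$\mathrm{Riem}^\flat(W,Z,X,Y) = g(W, \Omega(X,Y)Z) = g(W, ([\nabla_X, \nabla_Y] - \nabla_{[X,Y]})Z); \tag{4.34}$$
$$R^\flat_{\rho\sigma\mu\nu} = g_{\rho\tau}R^\tau{}_{\sigma\mu\nu} \equiv R_{\rho\sigma\mu\nu}. \tag{4.35}$$
We omit the “flat” suffix. This leads to some more identities satisfied by the Riemann tensor:
$$R_{\rho\sigma\nu\mu} = -R_{\rho\sigma\mu\nu}; \tag{4.36}$$
$$R_{\sigma\rho\mu\nu} = -R_{\rho\sigma\mu\nu}; \tag{4.37}$$
$$R_{\mu\nu\rho\sigma} = R_{\rho\sigma\mu\nu}, \tag{4.38}$$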


of which the first is trivial from (4.13) and hence did not require lowering indices, the second
states that each map $\Omega(X,Y)$ is an infinitesimal isometry of $T_xM$ (i.e., it is skew-adjoint with respect to $g_x$), and the third is conceptually bizarre, since,
as we explained, the first pair of indices plays a completely different role from the second (and
yet one can apparently interchange them). Its proof is straightforward from (4.14) - (4.15), but to
avoid a long calculation one may again want to use geodesic normal coordinates, in which,
from (4.26), at the origin one has an expression rapidly yielding (4.38), namely

$$R_{\rho\sigma\mu\nu} = \tfrac12(\partial_\sigma\partial_\mu g_{\nu\rho} - \partial_\nu\partial_\sigma g_{\mu\rho} + \partial_\nu\partial_\rho g_{\mu\sigma} - \partial_\mu\partial_\rho g_{\nu\sigma}). \tag{4.39}$$

¹⁵⁴ See Kazdan (1981). Einstein’s contracted Bianchi identity (7.56) will be proved separately in §7.2.
4.3 Sectional curvature and Theorema Egregium

All information in the Riemann tensor is in the so-called sectional curvature. Here is the key:¹⁵⁵
Proposition 4.4 The (pointwise) Riemann tensor $\mathrm{Riem}_x \in (T_x^*M)^{\otimes 4}$ is equivalent to a map
$$\mathrm{Riem}_x : \Lambda^2T_xM \to \Lambda^2T_xM, \tag{4.40}$$
which is linear and self-adjoint (i.e. symmetric) with respect to the inner product
$$\langle X_1\wedge X_2, Y_1\wedge Y_2\rangle_x = g_x(X_1,Y_1)\,g_x(X_2,Y_2), \tag{4.41}$$
where $X\wedge Y := (X\otimes Y - Y\otimes X)$. Thus $\mathrm{Riem}_x$ is specified by the associated quadratic form
$$Q_x : \Lambda^2T_xM \to \mathbb{R}; \quad X\wedge Y \mapsto \langle X\wedge Y, \mathrm{Riem}_x(X\wedge Y)\rangle_x = \mathrm{Riem}_x(X,Y,X,Y). \tag{4.42}$$
Proof. We first show that $\mathrm{Riem}_x$ is equivalent to a linear map
$$\mathrm{Riem}_x : T_xM\otimes T_xM \to T_xM\otimes T_xM. \tag{4.43}$$
1. Recalling (4.1) - (4.2) and (4.10), we have $\Omega_x(X,Y) \in \mathrm{Hom}(T_xM,T_xM)$ by definition.
2. Linear extension of $\theta\otimes v \mapsto (w \mapsto \theta(w)v)$ gives an isomorphism $V^*\otimes V \cong \mathrm{Hom}(V,V)$.
3. A metric on $V$ gives $V^* \cong V$ canonically (cf. §2.3), so that $\mathrm{Hom}(V,V) \cong V\otimes V$.
By the symmetry (4.38), the map (4.43) is self-adjoint with respect to the bilinear form
$$\langle X_1\otimes X_2, Y_1\otimes Y_2\rangle_x = g_x(X_1,Y_1)\,g_x(X_2,Y_2). \tag{4.44}$$


Because of the symmetries (4.36) - (4.38), both the map $\mathrm{Riem}_x$ and the bilinear form (4.44)
restrict to the linear subspace $\Lambda^2T_xM \subset T_xM\otimes T_xM$, without any loss of information.
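The statement that a symmetric linear map is specified by its quadratic form rests on the polarization identity $\langle X,TY\rangle = \tfrac14(Q(X+Y) - Q(X-Y))$. The following Python sketch checks this identity in exact integer arithmetic; the matrix `T`, the vectors `X`, `Y`, and the helper names are arbitrary choices made for this illustration, with the standard inner product on $\mathbb{R}^3$ playing the role of $\langle\cdot,\cdot\rangle$.

```python
from fractions import Fraction

def inner(x, y):
    """Standard inner product on R^3 (stand-in for the inner product on W)."""
    return sum(a * b for a, b in zip(x, y))

def apply(T, x):
    """Matrix-vector product T x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

# A symmetric matrix, i.e. <X, TY> = <TX, Y> for the standard inner product.
T = [[2, -1, 0],
     [-1, 3, 4],
     [0, 4, -5]]

def Q(x):
    """Associated quadratic form Q(X) = <X, TX>."""
    return inner(x, apply(T, x))

def recovered(x, y):
    """<X, TY> rebuilt from Q alone via polarization."""
    plus = [a + b for a, b in zip(x, y)]
    minus = [a - b for a, b in zip(x, y)]
    return Fraction(Q(plus) - Q(minus), 4)

X, Y = [1, 2, -3], [4, 0, 5]
direct = inner(X, apply(T, Y))   # evaluates <X, TY> directly
via_Q = recovered(X, Y)          # evaluates it from Q only
```

Symmetry of `T` is essential here: for a non-symmetric matrix the same formula would only recover the symmetric part of `T`.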


Explicitly, the map (4.40) is given by linear extension of 
Riem, : du Ady > ge Rp uvðp Ada (4.45) 
It is easy to show that $X,Y \in T_xM$ are linearly independent iff $P_x(X\wedge Y) \neq 0$, where
$$P_x(X\wedge Y) := g_x(X\wedge Y, X\wedge Y) = g_x(X,X)\,g_x(Y,Y) - g_x(X,Y)^2 \tag{4.46}$$
is the square of the (metric) area of the parallelogram in $T_xM$ with sides $X$ and $Y$, up to a sign.
Definition 4.5 If $P_x(X\wedge Y) \neq 0$, the sectional curvature $C_x(X\wedge Y)$ of the $X$-$Y$ plane is given by

$$C_x(X\wedge Y) := \frac{Q_x(X\wedge Y)}{P_x(X\wedge Y)} = \frac{\mathrm{Riem}_x(X,Y,X,Y)}{g_x(X,X)\,g_x(Y,Y) - g_x(X,Y)^2}. \tag{4.47}$$


¹⁵⁵ Let $V$ be a (real) vector space. Defining $T : V\otimes V \to V\otimes V$ by linear extension of $v\otimes w \mapsto w\otimes v$, the space
$\Lambda^2V = V\otimes_A V \subset V\otimes V$ is the antisymmetric part of $V\otimes V$, defined as the eigenspace of $T$ with eigenvalue $-1$.
Furthermore, if $T : W \to W$ is linear and symmetric with respect to some inner product $\langle\cdot,\cdot\rangle$ on $W$, i.e., $\langle X,TY\rangle = \langle TX,Y\rangle$ for all $X,Y \in W$, then the associated quadratic form $Q : W \to \mathbb{R}$ is defined by $Q(X) = \langle X,TX\rangle$. The map
$T$ may then be recovered from $Q$ (and the inner product) via the formula $\langle X,TY\rangle = \tfrac14(Q(X+Y) - Q(X-Y))$.
The specific combination in (4.47) makes $C_x(X\wedge Y)$ independent of the choice of $X$ and $Y$ within
the plane (in $T_xM$) they span, and hence makes $C_x$ a function of that plane only. Moreover,
Proposition 4.4 shows that we may interchangeably use either the Riemann tensor itself or its
associated sectional curvatures. For an orthonormal pair $X = e_a$, $Y = e_b$ we simply have

$$C_x(e_a,e_b) = \mathrm{Riem}_x(e_a,e_b,e_a,e_b). \tag{4.48}$$


We now explain how the notion of sectional curvature is related to the classical differential 
geometry of surfaces, especially through the famous Theorema Egregium of Gauss from 1828. 

The classical theory of surfaces $\Sigma \subset \mathbb{R}^3$ was largely based on local constructions. Let $U \subset \mathbb{R}^2$
be open and let $F : U \to \mathbb{R}^3$ be a smooth map that is a homeomorphism onto its image $F(U) = \Sigma$
and also has injective derivatives $F'_u : T_uU \to T_{F(u)}\mathbb{R}^3$ for all $u \in U$ (equivalently, $F'_u$ has rank 2).

If $u = (u^1,u^2)$ are the standard coordinates on $U$, we simply say $F(u^1,u^2) \in \Sigma \subset \mathbb{R}^3$ has
coordinates $(u^1,u^2)$, too. This gives three canonical vector fields in $\mathbb{R}^3$ defined on $\Sigma$, viz.¹⁵⁶

$$X_1 := F'(\partial/\partial u^1); \quad X_2 := F'(\partial/\partial u^2); \quad N := X_1\times X_2/\|X_1\times X_2\|. \tag{4.49}$$
The vectors $X_1$ and $X_2$ are tangent to $\Sigma$, whereas $N$ is orthogonal to $\Sigma$. Since the pair $(X_1,X_2)$ is
a basis of $T_{F(u)}\Sigma$, $u \in U$, the triple $(X_1,X_2,N)$ is a basis of $T_{F(u)}\mathbb{R}^3 \cong \mathbb{R}^3$. Early Greek alphabet
indices $\alpha, \beta$ etc. run through 1, 2, whereas $i,j,k = 1,2,3$. The following two tensors on $\Sigma$ go
back to Gauss (and will be used in a similar way in the PDE approach to GR, see chapter 8):


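1. The first fundamental form is the metric $g$ induced by the Euclidean metric $\delta$ on $\mathbb{R}^3$, i.e.
$$g_{\alpha\beta} = g(\partial_\alpha,\partial_\beta) = \langle X_\alpha,X_\beta\rangle = \sum_{i=1}^3 \frac{\partial F^i}{\partial u^\alpha}\frac{\partial F^i}{\partial u^\beta}. \tag{4.50}$$
Note that although the $(\partial_1,\partial_2)$ basis is orthonormal in $U \subset \mathbb{R}^2$, its pushforward $(X_1,X_2)$ to
$\Sigma$ may no longer be orthonormal in $\mathbb{R}^3$: this depends on the embedding map $F$.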
2. The second fundamental form or extrinsic curvature (a more telling name!) $k$ of the
embedding is constructed as follows. First, for $X = X^\alpha X_\alpha \in \mathfrak{X}(\Sigma)$ we define the 3-vector

$$\nabla_XN = X^\alpha\frac{\partial N}{\partial u^\alpha}. \tag{4.51}$$

If $X_{F(u)}$ is tangent to a curve $F(\gamma^1(t),\gamma^2(t))$, then $X^\alpha = (d\gamma^\alpha/dt)_{t=0}$. We may then
also write $\nabla_XN(u^1,u^2) = (dN(\gamma^1(t),\gamma^2(t))/dt)_{t=0}$ (the notation $\nabla_X$ is used because from
a “higher perspective” one uses covariant differentiation with respect to the Levi-Civita
connection defined by the flat metric $\delta$ on $\mathbb{R}^3$). One could also simply say that

$$\nabla_XN^i = X(N^i) = X^\alpha\partial_\alpha N^i \quad (i = 1,2,3), \tag{4.52}$$
which is (3.36) with vanishing Christoffel symbols (in $\mathbb{R}^3$). Since $\langle N,N\rangle = 1$, we have
$$0 = X(1) = X(\langle N,N\rangle) = \langle\nabla_XN,N\rangle + \langle N,\nabla_XN\rangle = 2\langle\nabla_XN,N\rangle, \tag{4.53}$$

so that $\nabla_XN$ is orthogonal to $N$ (in $\mathbb{R}^3$), and hence it must be tangent to $\Sigma$. This gives rise
to the Weingarten map (with a conventional minus sign for historical reasons)

$$W : T\Sigma \to T\Sigma; \quad X \mapsto -\nabla_XN. \tag{4.54}$$


¹⁵⁶ Injectivity of $F'$ implies that the denominator in (4.49) is nonzero.
In terms of the Weingarten map, Gauss and Monge defined two curvature scalars, namely

$$K = \det(W) = \kappa_1\kappa_2 \quad \text{(Gauss curvature)}; \tag{4.55}$$
$$H = \mathrm{tr}(W) = \kappa_1 + \kappa_2 \quad \text{(mean curvature)}, \tag{4.56}$$

where $\kappa_1$ and $\kappa_2$ are the eigenvalues of $W$, as well as the extrinsic curvature tensor, i.e. $k$,
$$k(X,Y) := g(W(X),Y) = -g(\nabla_XN,Y) = -\langle\nabla_XN,Y\rangle. \tag{4.57}$$

It is easy to show that the extrinsic curvature tensor thus defined is symmetric, i.e.,

$$k(X,Y) = k(Y,X), \tag{4.58}$$

which is the same as $\langle\nabla_YN,X\rangle = \langle\nabla_XN,Y\rangle$. To see this, note that $\langle N,X\rangle = 0$ (since $X$ and $Y$ are
tangent to $\Sigma$ and hence orthogonal to $N$), hence $0 = Y(\langle N,X\rangle) = \langle\nabla_YN,X\rangle + \langle N,\nabla_YX\rangle$. Since
$\nabla$ (as the flat Levi-Civita connection on $\mathbb{R}^3$) is torsion-free, we have $\nabla_YX = \nabla_XY - [X,Y]$, so

$$\langle\nabla_YN,X\rangle = -\langle N,\nabla_YX\rangle = -\langle N,\nabla_XY\rangle + \langle N,[X,Y]\rangle = -\langle N,\nabla_XY\rangle = \langle\nabla_XN,Y\rangle. \tag{4.59}$$


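Here we also used $\langle N,[X,Y]\rangle = 0$, because $[X,Y]$ is tangent to $\Sigma$ whenever $X$ and $Y$ are. This
computation also yields an alternative expression for $k$, which is manifestly symmetric:

$$k_{\alpha\beta} = \langle N,X_{\alpha\beta}\rangle; \qquad X_{\alpha\beta} := \partial_\beta X_\alpha, \tag{4.60}$$
where, in terms of $F : U \to \mathbb{R}^3$, the components $X^i_{\alpha\beta}$ of the vector $X_{\alpha\beta}$ are simply given by

$$X^i_{\alpha\beta} = \frac{\partial^2F^i}{\partial u^\alpha\partial u^\beta}. \tag{4.61}$$

The relationship between the two curvature scalars and the two fundamental forms is

$$K = \det(k)/\det(g); \tag{4.62}$$

$$H = \mathrm{tr}(g^{-1}k) = \sum_{\alpha,\beta=1,2} g^{\alpha\beta}k_{\alpha\beta}. \tag{4.63}$$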


These objects are very useful, if only because they are quite easy to compute in practice: 


• In the simplest case (from which all others follow by translation and rotation), a plane in
$\mathbb{R}^3$ is parametrized by $(x = u^1, y = u^2, z = 0)$, i.e., officially,

$$F^1(u^1,u^2) = u^1; \quad F^2(u^1,u^2) = u^2; \quad F^3(u^1,u^2) = 0. \tag{4.64}$$
The induced metric follows from (4.50) as $g_{11} = g_{22} = 1$, i.e.
$$g = (du^1)^2 + (du^2)^2. \tag{4.65}$$
Eq. (4.49) gives the normal as $N = (0,0,1)$, which is independent of $(u^1,u^2)$, so

$$k = 0. \tag{4.66}$$

Consequently, $H = K = 0$, as expected.
• The cylinder $C_a$ with radius $a$ is defined by $x^2 + y^2 = a^2$ and $z$ arbitrary, and hence may
be parametrized by $(u^1,u^2) = (\varphi,z)$, where $\varphi \in [0,2\pi]$ and $z \in \mathbb{R}$, so that

$$(x = a\cos u^1,\; y = a\sin u^1,\; z = u^2). \tag{4.67}$$
This time the induced metric is
$$g = a^2d\varphi^2 + dz^2, \tag{4.68}$$
whereas the normal $N(\varphi,z) = (\cos\varphi,\sin\varphi,0)$ leads to
$$k = -a\,d\varphi^2. \tag{4.69}$$
Since $g^{11} = g^{\varphi\varphi} = 1/a^2$, this gives

$$K = 0; \quad H = -\frac{1}{a}. \tag{4.70}$$

This is a very natural result: the larger $a$ is, the more the cylinder locally approximates a
plane, whose extrinsic curvature vanishes.


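• Finally, the sphere $S^2_a$ is defined by $x^2 + y^2 + z^2 = a^2$ and hence we may define
$$x = a\sin\theta\cos\varphi; \quad y = a\sin\theta\sin\varphi; \quad z = a\cos\theta, \tag{4.71}$$
which of course gives the well-known “round” metric
$$g = a^2d\Omega; \quad d\Omega := d\theta^2 + \sin^2\theta\,d\varphi^2. \tag{4.72}$$
The normal vector is $N(\theta,\varphi) = (\sin\theta\cos\varphi,\sin\theta\sin\varphi,\cos\theta)$, which gives
$$k = -a^{-1}g = -a(d\theta^2 + \sin^2\theta\,d\varphi^2), \tag{4.73}$$
so that

$$K = \frac{1}{a^2}; \quad H = -\frac{2}{a}. \tag{4.74}$$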
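The three examples above can also be checked numerically. The following Python sketch (an illustration with names chosen here: `curvatures`, `cylinder`, `sphere`, the step `h`, and the sample points are not from the text) builds $g_{\alpha\beta}$ from (4.50), the normal $N$ from (4.49), and $k_{\alpha\beta} = \langle N, X_{\alpha\beta}\rangle$ from (4.60) by central finite differences, and then evaluates $K$ and $H$ through (4.62) and (4.63).

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def curvatures(F, u, h=1e-4):
    """Return (K, H) at u = (u1, u2) for an embedding F(u1, u2) -> R^3."""
    def at(d1, d2):  # F evaluated at u shifted by (d1*h, d2*h)
        return F(u[0] + d1 * h, u[1] + d2 * h)

    unit = [(1, 0), (0, 1)]

    def first(a):  # X_a = dF/du^a
        e = unit[a]
        p, m = at(e[0], e[1]), at(-e[0], -e[1])
        return [(p[i] - m[i]) / (2 * h) for i in range(3)]

    def second(a, b):  # X_ab = d^2F/du^a du^b, as in (4.61)
        ea, eb = unit[a], unit[b]
        pp, pm, mp, mm = [at(sa * ea[0] + sb * eb[0], sa * ea[1] + sb * eb[1])
                          for sa in (1, -1) for sb in (1, -1)]
        return [(pp[i] - pm[i] - mp[i] + mm[i]) / (4 * h * h) for i in range(3)]

    X = [first(0), first(1)]
    n = cross(X[0], X[1])
    norm = math.sqrt(dot(n, n))
    N = [c / norm for c in n]                                         # (4.49)
    g = [[dot(X[a], X[b]) for b in range(2)] for a in range(2)]       # (4.50)
    k = [[dot(N, second(a, b)) for b in range(2)] for a in range(2)]  # (4.60)
    det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    det_k = k[0][0] * k[1][1] - k[0][1] * k[1][0]
    ginv = [[g[1][1] / det_g, -g[0][1] / det_g],
            [-g[1][0] / det_g, g[0][0] / det_g]]
    K = det_k / det_g                                                 # (4.62)
    H = sum(ginv[a][b] * k[a][b] for a in range(2) for b in range(2)) # (4.63)
    return K, H

a = 2.0  # sample radius

def cylinder(phi, z):
    return [a * math.cos(phi), a * math.sin(phi), z]

def sphere(theta, phi):
    return [a * math.sin(theta) * math.cos(phi),
            a * math.sin(theta) * math.sin(phi),
            a * math.cos(theta)]

K_cyl, H_cyl = curvatures(cylinder, (0.7, 1.5))
K_sph, H_sph = curvatures(sphere, (1.0, 0.3))
```

Up to discretization error the results can be compared with (4.70) and (4.74). The signs of $H$ depend on the orientation of $N$ fixed by (4.49), i.e. on the order of the coordinates.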


Somewhat anachronistically compared to Gauss, we now define the Riemann tensor $R^\alpha{}_{\beta\gamma\delta}$ in
terms of the metric $g$ on $\Sigma$ as in (4.14), and lower the first index with $g$ as usual.¹⁵⁷ Since $\Sigma$ is
two-dimensional, there is just one sectional curvature, given, from (4.47), by

$$C = C(X_1,X_2) = R_{1212}/\det(g). \tag{4.75}$$
In slightly modernized form, then, the Theorema Egregium of Gauss states that
$$K = C. \tag{4.76}$$

Gauss found this theorem remarkable because it equates $K$, which is a priori defined extrinsically
through the Weingarten map $W$ and hence through the embedding of $\Sigma$ in $\mathbb{R}^3$, with $C$, which is
defined via the intrinsic geometry of $\Sigma$ as encoded by its internal metric $g_{\alpha\beta}$.


¹⁵⁷ This, as well as index raising, applies generally to all tensors on $\Sigma$, e.g. $k^\beta_\alpha = g^{\beta\gamma}k_{\gamma\alpha}$. Also, $k_{\alpha\gamma,\beta} = \partial_\beta k_{\alpha\gamma}$.
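The proof relies on equations that are of independent interest (and will resurface in GR):

$$X_{\alpha\beta} = \Gamma^\gamma_{\alpha\beta}X_\gamma + k_{\alpha\beta}N; \quad \text{(Gauss)} \tag{4.77}$$

$$\partial_\alpha N = -k^\beta_\alpha X_\beta; \quad \text{(Weingarten)} \tag{4.78}$$

$$R^\delta{}_{\alpha\gamma\beta} = k^\delta_\gamma k_{\alpha\beta} - k^\delta_\beta k_{\alpha\gamma}; \quad \text{(Gauss)} \tag{4.79}$$

$$k_{\alpha\beta,\gamma} + \Gamma^\delta_{\alpha\beta}k_{\gamma\delta} = k_{\alpha\gamma,\beta} + \Gamma^\delta_{\alpha\gamma}k_{\beta\delta}, \quad \text{(Codazzi)} \tag{4.80}$$

where the $\Gamma^\gamma_{\alpha\beta}$ are the Christoffel symbols (as originally introduced!) associated to the metric
$g$ on $\Sigma$, and $k^\beta_\alpha = g^{\beta\gamma}k_{\alpha\gamma}$, where $(g^{\alpha\beta})$ is the inverse matrix to $(g_{\alpha\beta})$, as usual. Weingarten’s
eq. (4.78) is just a restatement of (4.57), and hence is the definition of $k_{\alpha\beta}$. Gauss’s eq. (4.77)
is simply the expansion of the 3-vectors $X_{\alpha\beta}$ in terms of the basis $(X_1,X_2,N)$. The specific form
$k_{\alpha\beta}$ of the coefficient of $N$ immediately follows from (4.60). To derive the coefficient of $X_\gamma$, let
us assume (4.77) for initially unknown coefficients $\Gamma^\gamma_{\alpha\beta}$. We then obtain

$$\langle X_\delta,X_{\alpha\beta}\rangle = \Gamma^\gamma_{\alpha\beta}\langle X_\delta,X_\gamma\rangle = g_{\gamma\delta}\Gamma^\gamma_{\alpha\beta}, \tag{4.81}$$
so that $\Gamma^\gamma_{\alpha\beta} = g^{\gamma\delta}\langle X_\delta,X_{\alpha\beta}\rangle$. The relation (3.25) then follows from (4.60), which yields
$$2\langle X_\delta,X_{\alpha\beta}\rangle = \partial_\beta\langle X_\delta,X_\alpha\rangle + \partial_\alpha\langle X_\delta,X_\beta\rangle - \partial_\delta\langle X_\alpha,X_\beta\rangle. \tag{4.82}$$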
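The Gauss-Codazzi equations (4.79) - (4.80) then follow from the integrability condition

$$\partial_\gamma\partial_\beta X_\alpha = \partial_\beta\partial_\gamma X_\alpha, \tag{4.83}$$

i.e., $\partial_\gamma X_{\alpha\beta} = \partial_\beta X_{\alpha\gamma}$. Indeed, the Gauss-Weingarten equations (4.77) - (4.78) give

$$\partial_\gamma X_{\alpha\beta} = \big(\Gamma^\delta_{\alpha\beta,\gamma} + \Gamma^\epsilon_{\alpha\beta}\Gamma^\delta_{\gamma\epsilon} - k_{\alpha\beta}k^\delta_\gamma\big)X_\delta + \big(k_{\alpha\beta,\gamma} + \Gamma^\delta_{\alpha\beta}k_{\gamma\delta}\big)N, \tag{4.84}$$

so that Gauss’s equation (4.79) is the component of (4.83) tangential (to $\Sigma$), whilst Codazzi’s
equation (4.80) is its normal component. The Theorema Egregium now follows from (4.79),
since (4.76) is the same as $\det(k) = R_{1212}$.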
Take the cylinder, whose metric (4.68) is flat. Hence (4.76) is just $0 = 0$ (and this is of
course also true for the plane). The sphere is less trivial; either direct computation or eq. (4.85)
and Theorem 4.8 in the next section show that $R_{1212} = g_{11}g_{22}/a^2 = a^2\sin^2\theta$, so that, with
$\det(g) = a^4\sin^2\theta$, we find $R_{1212}/\det(g) = 1/a^2$, which, given (4.74), confirms (4.75) - (4.76).


Finally, we return to the interpretation of sectional curvature in general (semi) Riemannian
geometry. In §5.2 we will see that each $x \in M$ has a so-called normal neighbourhood $U_x$ that is
diffeomorphic to some subset $V_x \subset T_xM$ through the exponential map $\exp_x : V_x \to M$. Take
linearly independent vectors $X,Y \in T_xM$ with associated plane $\mathrm{span}(X,Y) \subset T_xM$, and consider
the two-dimensional submanifold $\Sigma_{X,Y} = \exp_x(\mathrm{span}(X,Y)\cap V_x) \subset U_x$ of $M$; note that $\Sigma_{X,Y}$ is
spanned by geodesics emanating from $x$ that have tangent vectors in $\mathrm{span}(X,Y)$. This surface
has an intrinsically defined Gaussian curvature $K$, which, at $x$, by the Theorema Egregium is
just its sectional curvature $C_x(X,Y)$. It follows that, through its associated sectional curvatures
(which in turn define it), the Riemann tensor gives the Gaussian curvatures $K$ of all such
two-dimensional submanifolds of $M$. Conversely, these quantities give a complete description of
the Riemann tensor. Its original definition (4.12) through the covariant derivative, which is very
abstract, therefore has an interpretation in classical two-dimensional differential geometry.
4.4 Spaces of constant curvature

As another take on sectional curvature we now turn to the important case where it is constant:

Definition 4.6 A connected semi-Riemannian manifold $(M,g)$ has constant curvature if all
sectional curvatures $C_x(X,Y)$ coincide (where $x \in M$ and $X,Y \in T_xM$ vary).

We assume $n := \dim(M) \geq 2$. If $n \geq 3$, and $C_x(X,Y)$ is independent of $X$ and $Y$ for each $x$, then
the common value of $C_x(X,Y)$ is also independent of $x$, so that $(M,g)$ has constant curvature.¹⁵⁸


Proposition 4.7 If $(M,g)$ has constant curvature, then the Riemann tensor (4.14), the Ricci
tensor (3.14), and the Ricci scalar (3.15) (see also §4.5) are given by, respectively,

$$R_{ijkl} = k(g_{ik}g_{jl} - g_{il}g_{jk}); \quad R_{ij} = (n-1)k\,g_{ij}; \quad R = n(n-1)k, \tag{4.85}$$
where $k$ is the common value of all sectional curvatures, called the curvature of $(M,g)$.


Proof. Let $C_x(X,Y) = k(x)$ for all $X,Y \in T_xM$ and some $k \in C^\infty(M)$. In terms of the tensor
$$S_x(V,W,X,Y) = g_x(V,X)\,g_x(W,Y) - g_x(V,Y)\,g_x(W,X) \tag{4.86}$$

of type (4,0), eq. (4.47) gives $\mathrm{Riem}_x(X,Y,X,Y) = k(x)S_x(X,Y,X,Y)$. But since the Riemann
tensor is completely defined by its sectional curvatures, this implies $\mathrm{Riem}_x = k(x)S_x$.
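The algebra behind Proposition 4.7 can be verified in exact arithmetic. The following Python sketch builds $R_{ijkl} = k(g_{ik}g_{jl} - g_{il}g_{jk})$ for a sample diagonal Lorentzian metric and a sample value of $k$ (both arbitrary choices made here), and checks the symmetries (4.36) - (4.38), the first Bianchi identity (4.24) with the first index lowered, the contractions in (4.85), and that the sectional curvature (4.47) of a sample plane equals $k$. The contraction convention $R_{ij} = g^{pq}R_{qipj}$ used for the Ricci tensor is an assumption of this sketch, since (3.14) is not reproduced in this section.

```python
from fractions import Fraction
from itertools import product

n = 4
k = Fraction(-1, 9)      # sample curvature (k = -1/rho^2 with rho = 3)
diag = [-1, 2, 3, 5]     # sample diagonal metric of Lorentzian signature
idx = range(n)

g = [[Fraction(diag[i]) if i == j else Fraction(0) for j in idx] for i in idx]
ginv = [[1 / g[i][i] if i == j else Fraction(0) for j in idx] for i in idx]

# R[i, j, a, b] = R_{ijab} = k (g_{ia} g_{jb} - g_{ib} g_{ja}), cf. (4.85)
R = {(i, j, a, b): k * (g[i][a] * g[j][b] - g[i][b] * g[j][a])
     for i, j, a, b in product(idx, repeat=4)}

all4 = list(product(idx, repeat=4))
antisym_last = all(R[i, j, b, a] == -R[i, j, a, b] for i, j, a, b in all4)   # (4.36)
antisym_first = all(R[j, i, a, b] == -R[i, j, a, b] for i, j, a, b in all4)  # (4.37)
pair_sym = all(R[a, b, i, j] == R[i, j, a, b] for i, j, a, b in all4)        # (4.38)
bianchi = all(R[i, j, a, b] + R[i, a, b, j] + R[i, b, j, a] == 0
              for i, j, a, b in all4)                                        # (4.24)

# Ricci tensor and scalar (assumed convention R_ij = g^{pq} R_{qipj}).
ricci = [[sum(ginv[p][q] * R[q, i, p, j] for p in idx for q in idx)
          for j in idx] for i in idx]
ricci_ok = all(ricci[i][j] == (n - 1) * k * g[i][j] for i in idx for j in idx)
scalar = sum(ginv[i][j] * ricci[i][j] for i in idx for j in idx)

def metric(X, Y):
    return sum(g[i][j] * X[i] * Y[j] for i in idx for j in idx)

def riem(V, W, X, Y):
    return sum(R[i, j, a, b] * V[i] * W[j] * X[a] * Y[b] for i, j, a, b in all4)

# Sectional curvature (4.47) of the plane spanned by two sample vectors.
X, Y = [1, 2, 0, -1], [0, 1, 3, 1]
P = metric(X, X) * metric(Y, Y) - metric(X, Y) ** 2   # (4.46), nonzero here
C = riem(X, Y, X, Y) / P
```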
In $n = 2$ this just means that the scalar curvature is constant. Definition 4.6 becomes increasingly
stringent in higher dimension, as $T_xM$ contains an increasing number of planes whose sectional
curvatures have to coincide, but this is balanced by the larger variety of possible manifolds and
metrics, so that the classification is the same for any dimension $n \geq 2$. Even the Riemannian and
the Lorentzian cases look strikingly similar, as we shall see. We start with the former.¹⁵⁹


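Theorem 4.8 If $n \geq 2$, any (geodesically) complete and simply connected Riemannian manifold
$(M,g)$ with constant curvature $k$ is isometrically isomorphic to one of the following spaces:

• $k = 1/\rho^2 > 0$: The $n$-dimensional sphere $S^n_\rho$ with radius $\rho > 0$ in $\mathbb{R}^{n+1}$, i.e.
$$S^n_\rho = \Big\{x \in \mathbb{R}^{n+1} \;\Big|\; \sum_{i=1}^{n+1} x_i^2 = \rho^2\Big\}, \tag{4.87}$$
with metric inherited from $\mathbb{R}^{n+1}$ with Euclidean metric $\delta(X,Y) = \sum_{i=1}^{n+1} X^iY^i$.

• $k = 0$: The $n$-dimensional Euclidean space $\mathbb{R}^n$ with metric $\delta(X,Y) = \sum_{i=1}^{n} X^iY^i$.

• $k = -1/\rho^2 < 0$: The $n$-dimensional hyperboloid $H^n_\rho$ in $\mathbb{R}^{n+1}$ with label $\rho > 0$ defined by
$$H^n_\rho = \Big\{x \in \mathbb{R}^{n+1} \;\Big|\; -x_0^2 + \sum_{j=1}^{n} x_j^2 = -\rho^2,\; x_0 > 0\Big\}, \tag{4.88}$$
with metric inherited from $\mathbb{R}^{n+1}$ with Minkowski metric $\eta(X,Y) = -X^0Y^0 + \sum_{i=1}^{n} X^iY^i$.

In both $S^n_\rho$ and $H^n_\rho$, the geodesics are the intersections with a plane in $\mathbb{R}^{n+1}$ through zero.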
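The last claim can be probed numerically for $H^2_\rho$. The following Python sketch takes the curve $\gamma(t) = \rho(\cosh(t/\rho), \sinh(t/\rho), 0)$, which is the intersection of $H^2_\rho$ with the plane $x_2 = 0$ through the origin, and checks three things by finite differences: that it lies on the hyperboloid (4.88), that it has unit speed with respect to $\eta$, and that its acceleration in $\mathbb{R}^3$ is proportional to the position vector. The sketch relies on two standard facts not proved in this section: the position vector is $\eta$-normal to $H^2_\rho$, and a curve in a hypersurface whose ambient acceleration is normal is a geodesic of the induced metric. The value of `rho`, the parameter `t`, and the step sizes are arbitrary choices.

```python
import math

rho = 1.5

def eta(X, Y):
    """Minkowski metric on R^3, signature (-, +, +)."""
    return -X[0] * Y[0] + X[1] * Y[1] + X[2] * Y[2]

def gamma(t):
    """Intersection of the hyperboloid with the plane x_2 = 0."""
    return [rho * math.cosh(t / rho), rho * math.sinh(t / rho), 0.0]

def velocity(t, h=1e-4):
    p, m = gamma(t + h), gamma(t - h)
    return [(a - b) / (2 * h) for a, b in zip(p, m)]

def acceleration(t, h=1e-3):
    p, c, m = gamma(t + h), gamma(t), gamma(t - h)
    return [(a - 2 * b + d) / (h * h) for a, b, d in zip(p, c, m)]

t = 0.8
x = gamma(t)

on_hyperboloid = eta(x, x) + rho ** 2     # vanishes on (4.88)
v = velocity(t)
speed2 = eta(v, v)                        # close to 1: unit-speed, spacelike
# Ambient acceleration minus x / rho^2: close to 0, so the acceleration is normal.
residual = max(abs(a - c / rho ** 2) for a, c in zip(acceleration(t), x))
```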


¹⁵⁸ See footnote 660 in §A.5 for a proof, or Corollary 2.2.7 in Wolf (2011), which is a classic on spaces of constant
curvature in any signature. For the Riemannian case see also the beautiful treatment by Vinberg et al. (1993).
¹⁵⁹ For us, saying that $M$ is simply connected also implies, by convention, that $M$ is connected.
In $n = 2$, where these spaces were first discovered,¹⁶⁰ $H^2_\rho$ is often realized as the Poincaré disc,

$$D^2 = \{(u,v) \in \mathbb{R}^2 \mid u^2 + v^2 < 4\rho^2\}, \tag{4.89}$$

equipped with the metric (which is already given by Riemann in his 1854 Habilitation lecture)

$$ds^2 = \frac{du^2 + dv^2}{\left(1 + \tfrac14 k(u^2 + v^2)\right)^2}, \tag{4.90}$$

where $k = -1/\rho^2$. The Poincaré disc has been turned into art in a famous woodcut by Escher:¹⁶¹


[Figure: Circle Limit IV (Heaven and Hell) by M.C. Escher, showing the Poincaré disc]


Independently of Bolyai and Lobachevskii, Riemann probably found the hyperbolic metric as
follows: stereographic projection of $S^2_\rho$ from the north pole onto the $z = 0$ plane in $\mathbb{R}^3$, i.e.
$$(x,y,z) \mapsto (u,v,0); \quad u = \frac{\rho x}{\rho - z}; \quad v = \frac{\rho y}{\rho - z}, \tag{4.91}$$
where $(x,y,z) \neq (0,0,\rho)$, gives the same metric (4.90), but this time with $k = 1/\rho^2$. However,
the later model (4.88) has the advantage that (for $n = 2$) its geodesics are simply the intersections
of $H^2_\rho$ with planes in $\mathbb{R}^3$ through the origin, exactly as for $S^2_\rho$ (giving the great circles).


¹⁶⁰ As mentioned in the historical introduction, the 2d hyperbolic spaces were independently discovered by Bolyai
and Lobachevskii in the 1830s and caused a revolution, in that Euclidean geometry no longer provided an absolute
source of truth in mathematics, so that eventually the link between mathematics and reality came to be dropped. For
$k < 0$, the need for something like an embedding in Minkowski space arises because Hilbert (1901) proved that it is
impossible to isometrically embed $D^2$ with its hyperbolic metric in $\mathbb{R}^3$, equipped with its usual (Euclidean) metric.

¹⁶¹ Copyright: The M.C. Escher Company, Baarn. See also Wieting (2010) and footnote 486.
In order to discuss the Lorentzian counterpart of Theorem 4.8 we need the Lorentzian cover of a
non-simply connected Lorentzian manifold $(M,g)$: this is the universal cover $\tilde{M}$ of $M$ equipped
with the pullback metric $\tilde{g} = \pi^*g$ of the covering projection $\pi : \tilde{M} \to M$. This complication was
not necessary in Theorem 4.8, since both $S^n_\rho \cong S^n$ and $H^n_\rho \cong \mathbb{R}^n$ are simply connected for $n \geq 2$.


Theorem 4.9 If $n \geq 2$, any (geodesically) complete and simply connected Lorentzian manifold
$(M,g)$ with constant curvature $k$ is isometrically isomorphic to one of the following spaces:¹⁶²

• $k = 1/\rho^2 > 0$: For $n > 2$, the $n$-dimensional de Sitter space $dS^n_\rho$ with $\rho > 0$ in $\mathbb{R}^{n+1}$, i.e.
$$dS^n_\rho = \Big\{x \in \mathbb{R}^{n+1} \;\Big|\; -x_0^2 + \sum_{i=1}^{n} x_i^2 = \rho^2\Big\}, \tag{4.92}$$
with metric inherited from $\mathbb{R}^{n+1}$ with Minkowski metric $\eta(X,Y) = -X^0Y^0 + \sum_{i=1}^{n} X^iY^i$.
For $n = 2$, however, one should use the Lorentzian cover $\widetilde{dS}{}^2_\rho$ (with associated metric).

• $k = 0$: The $n$-dimensional Minkowski space-time $(\mathbb{M}_n, \eta_n)$, i.e. $\mathbb{R}^n$ with metric (3.3).

• $k = -1/\rho^2 < 0$: The Lorentzian cover $\widetilde{AdS}{}^n_\rho$ of anti de Sitter space with $\rho > 0$, i.e.
$$AdS^n_\rho = \Big\{x \in \mathbb{R}^{n+1} \;\Big|\; -x_{-1}^2 - x_0^2 + \sum_{i=1}^{n-1} x_i^2 = -\rho^2\Big\}, \tag{4.93}$$
with metric inherited from $\mathbb{R}^{n+1}$ with the metric $-X^{-1}Y^{-1} - X^0Y^0 + \sum_{i=1}^{n-1} X^iY^i$.

In both $dS^n_\rho$ and $AdS^n_\rho$, the geodesics are the intersections with a plane in $\mathbb{R}^{n+1}$ through zero.¹⁶³


The proof of Theorems 4.8 and 4.9 is very long,¹⁶⁴ but below we sketch the main argument. This
uses some Lie group theory. We first comment on the similarity between Theorems 4.8 and 4.9.


¹⁶² De Sitter space was introduced by de Sitter (1917ab), and independently, along with anti de Sitter space, by
Levi-Civita (1917b). De Sitter’s papers were a response to Einstein (1917b), the paper that launched relativistic
cosmology (and, one could say, theoretical cosmology altogether). This, in turn, arose from earlier conversations
and correspondence between Einstein and Willem de Sitter (1872-1934), who was a prominent Dutch astronomer
(based at Leiden) and also an accomplished mathematician. Einstein (1917b) is also (in)famous because Einstein
introduced his cosmological constant $\lambda$ in it. In the early parts of the paper he paves the way for $\lambda$ by claiming
(incorrectly) that it solves the well-known paradox in Newtonian cosmology that if matter is distributed uniformly
in an infinite universe, the gravitational force at each point is infinite. But his real purpose was to rescue Mach’s
principle in the context of his new theory (see footnote 24). De Sitter invented his model (which is a solution to
Einstein’s vacuum field equations with positive cosmological constant) to (successfully) challenge this, which led
to an interesting and historically significant debate (Smeenk, 2014). Because of the discovery, at the very end of
the twentieth century, of an accelerated expansion of the universe (Kirshner, 2002), which requires $\lambda > 0$ (now
reinterpreted as “dark energy” and usually called $\Lambda$, as in the “$\Lambda$CDM Standard Model of Cosmology”) it is now
widely believed that we approximately live in a de Sitter universe. See also Kragh (2007) and Nussbaumer & Bieri
(2009). The popularity of anti de Sitter space has also exploded after the discovery of the AdS/CFT correspondence.
Useful references on (anti) de Sitter space range from Hawking & Ellis (1973), §5.2, to Moschella (2005).

¹⁶³ Instead of the ones in Theorem 4.9, also here one has disc-like realizations of these spaces, which are simply
obtained by replacing the Euclidean metric $du^2 + dv^2$ in (4.90) with the Minkowski metric $du^2 - dv^2$, and similarly
in higher dimension. However, the realizations of Theorem 4.9 are more widely used in GR.

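¹⁶⁴ Both theorems are a special case of Theorems 2.4.4 and 2.4.9 in Wolf (2011). Their common generalization is
as follows. For $s = 0$, equip $\mathbb{R}^{n+1}$ with the Euclidean metric $g^{(n+1)}_0 = \delta$. For $1 \leq s \leq n$, take the indefinite metric
$g^{(n+1)}_s(X,Y) = -\sum_{i=1}^{s} X^iY^i + \sum_{j=s+1}^{n+1} X^jY^j$. For $\rho > 0$, define $S^{n,s}_\rho \subset \mathbb{R}^{n+1}$ as the quadric $-\sum_{i=1}^{s} x_i^2 + \sum_{j=s+1}^{n+1} x_j^2 = \rho^2$,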
Namely, the assumption of (geodesic) completeness is much more natural in the Riemannian
setting than it is in Lorentzian geometry, where it is violated in many realistic space-times (see
chapter 6). Instead, a more natural completeness assumption would be global hyperbolicity, cf.
§5.7. In fact, de Sitter space is globally hyperbolic but anti de Sitter space is not. See §5.10.
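The need for the topological covering construction comes from the diffeomorphisms

$$dS^n_\rho \cong \mathbb{R}\times S^{n-1}; \tag{4.94}$$

$$(x_0,x_1,\ldots,x_n) \mapsto \Big(x_0, \frac{(x_1,\ldots,x_n)}{\sqrt{\rho^2 + x_0^2}}\Big); \tag{4.95}$$

$$AdS^n_\rho \cong S^1\times\mathbb{R}^{n-1}; \tag{4.96}$$

$$(x_{-1},x_0,x_1,\ldots,x_{n-1}) \mapsto \Big(\frac{(x_{-1},x_0)}{\sqrt{\rho^2 + \sum_{i=1}^{n-1}x_i^2}},\, (x_1,\ldots,x_{n-1})\Big), \tag{4.97}$$

so that $dS^2_\rho \cong \mathbb{R}\times S^1$ and hence $\widetilde{dS}{}^2_\rho \cong \mathbb{R}^2$, and, for any $n \geq 2$, we obtain $\widetilde{AdS}{}^n_\rho \cong \mathbb{R}^n$.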

Given Theorems 4.8 and 4.9, any other complete Riemannian or Lorentzian manifold with
constant curvature can be constructed from the above spaces by forming quotients of $M$ by
discrete subgroups $\Gamma$ of the isometry group of $(M,g)$ that act freely and properly discontinuously
on $M$.¹⁶⁵ In particular, the 2d de Sitter space $dS^2_\rho$ has constant curvature $k = 1/\rho^2$, and for any
$n \geq 2$ the multiply connected anti de Sitter spaces $AdS^n_\rho$ all have constant curvature $k = -1/\rho^2$.


Finally, for those familiar with Lie groups, we reformulate Theorems 4.8 and 4.9 in those 
terms. Spaces of constant curvature (and many other interesting Riemannian or Lorentzian 
manifolds with less symmetry) can be realized as homogeneous spaces (or coset spaces). See 
Appendix A; we will restrict the discussion here to the points of direct interest. 

An isometry of a metric $g$ on $M$ is a diffeomorphism $\varphi$ of $M$ such that

$$\varphi^*g = g, \quad \text{i.e.} \quad g_{\varphi(x)}(\varphi'_x(X),\varphi'_x(Y)) = g_x(X,Y) \quad \forall x \in M,\; X,Y \in T_xM. \tag{4.98}$$

The set of all such diffeomorphisms $\varphi$ is the isometry group of $(M,g)$, denoted by $\mathrm{Iso}(M,g)$.
This is by definition a subgroup of the “infinite-dimensional” group $\mathrm{Diff}(M)$, but it can be shown
that $\mathrm{Iso}(M,g)$ is a finite-dimensional Lie group in the compact-open topology.¹⁶⁶

Let $G$ be some subgroup of $\mathrm{Iso}(M,g)$ and suppose that $G$ acts transitively on $M$ (i.e. for each
$x,y \in M$ there is $\gamma \in G$ such that $y = \gamma x$). Choose some fixed $x' \in M$, with stabilizer

$$H = \{\gamma \in G \mid \gamma x' = x'\}. \tag{4.99}$$


with the metric induced from $g^{(n+1)}_s$, and let $H^{n,s}_\rho \subset \mathbb{R}^{n+1}$ be the quadric $-\sum_{i=1}^{s+1} x_i^2 + \sum_{j=s+2}^{n+1} x_j^2 = -\rho^2$, with the
metric induced from $g^{(n+1)}_{s+1}$. Then $S^{n,s}_\rho$ and $H^{n,s}_\rho$ are semi-Riemannian manifolds of signature $(s,n-s)$ with constant
curvatures $k = 1/\rho^2$ and $k = -1/\rho^2$, respectively. In particular, $dS^n_\rho = S^{n,1}_\rho$ and $AdS^n_\rho = H^{n,1}_\rho$. Complete this list
with the $k = 0$ case in signature $(s,n-s)$, which is obviously $\mathbb{R}^n$ with metric $g^{(n)}_s$; for $s = 1$ this is Minkowski
space-time. Passing to the universal semi-Riemannian cover for $S^{n,s}_\rho$ if $s = n-1$ and for $H^{n,s}_\rho$ if $s = 1$, up to isometry
these are all complete simply connected semi-Riemannian manifolds of signature $(s,n-s)$ with constant curvature.
In these realizations, the geodesics are once again the intersections with a plane in $\mathbb{R}^{n+1}$ through zero.
¹⁶⁵ We say that $\Gamma$ acts freely on $M$ if $\gamma x = x$ implies $\gamma = e$, and properly discontinuously if each $x \in M$ has a nbhd
$U$ such that the set $\{\gamma \in \Gamma \mid \gamma(U)\cap U \neq \emptyset\}$ is finite; in particular, $\Gamma$-orbits cannot have any accumulation point. Wolf
(2011) contains a complete solution of this problem, which is already very substantial for the hyperbolic space $H^2_\rho$.
¹⁶⁶ See e.g. O’Neill (1983), Theorem 9.32. The compact-open topology on a space of maps $F : X \to Y$ is generated
by open sets of the form $C_{K,U} = \{F \mid F(K) \subset U\}$, where $K$ is compact in $X$ and $U$ is open in $Y$.
This is a closed and hence Lie subgroup of $G$, and we obtain a diffeomorphism
$$\psi : M \to G/H; \quad \psi(\gamma x') = \gamma H. \tag{4.100}$$

If we define the canonical action of $G$ on $G/H$ by $\gamma(\gamma'H) = (\gamma\gamma')H$, then $\psi$ is equivariant, in
that $\psi(\gamma x) = \gamma\psi(x)$. We then equip $G/H$ with the unique metric $g' = (\psi^{-1})^*g$ that makes $\psi$
an isometry, namely $g = \psi^*g'$. The above $G$-action on $G/H$ is then transitive and isometric.
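A familiar instance of this setup is the sphere $S^2_\rho$ with rotations of $\mathbb{R}^3$ acting on it. The following Python sketch (an illustration only; the helper names, angles, and the choice of base point $x' = (0,0,\rho)$ are assumptions of the example) checks that a rotation $\gamma$ is an isometry of the Euclidean metric ($\gamma^T\gamma = 1$), that suitable rotations move $x'$ to an arbitrary point of the sphere (transitivity), that rotations about the $z$-axis fix $x'$ (so they lie in the stabilizer $H$), and that $\gamma$ and $\gamma h$ with $h \in H$ send $x'$ to the same point, which is why points of $M$ correspond to cosets $\gamma H$.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def apply(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(p):
    c, s = math.cos(p), math.sin(p)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))

rho = 2.0
north = [0.0, 0.0, rho]                     # base point x'
theta, phi = 1.1, 2.3
target = [rho * math.sin(theta) * math.cos(phi),
          rho * math.sin(theta) * math.sin(phi),
          rho * math.cos(theta)]            # arbitrary point y on the sphere

gam = matmul(rot_z(phi), rot_y(theta))      # group element with gam x' = y
h = rot_z(0.9)                              # element of the stabilizer of x'

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gtg = matmul(transpose(gam), gam)
isometry_err = max(abs(gtg[i][j] - identity[i][j])
                   for i in range(3) for j in range(3))
transitive_err = max_diff(apply(gam, north), target)
stabilizer_err = max_diff(apply(h, north), north)
coset_err = max_diff(apply(matmul(gam, h), north), target)
```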
Conversely, we start with a Lie group $G$ and a closed subgroup $H \subset G$ and study possible
$G$-invariant metrics on $G/H$. This is done in Appendix A, with the following conclusion:


Proposition 4.10 1. There is a bijective correspondence between $G$-invariant metrics on
$G/H$ and $\mathrm{Ad}'(H)$-invariant metrics on $\mathfrak{g}/\mathfrak{h}$, and hence, if $\mathfrak{g} = \mathfrak{h}\oplus\mathfrak{p}$ (see below), on $\mathfrak{p}$.

2. There is a unique $G$-invariant metric on $G/H$ (up to scaling by a positive constant) iff the
$\mathrm{Ad}'(H)$-action on $\mathfrak{g}/\mathfrak{h}$ (or, if applicable, on $\mathfrak{p}$) is irreducible.


Here $\mathfrak{g}$ and $\mathfrak{h}$ are the Lie algebras of $G$ and $H$, respectively, with $\mathfrak{h} \subset \mathfrak{g}$. Any group $G$ acts on itself
by the adjoint action $\mathrm{Ad}_\gamma(\delta) = \gamma\delta\gamma^{-1}$. If $G$ is a Lie group, this action defines a representation
$\mathrm{Ad}'$ of $G$ on its Lie algebra $\mathfrak{g}$, defined by $\mathrm{Ad}'_\gamma(X) = \gamma X\gamma^{-1}$ (this notation is justified since in
Appendix A we define Lie groups and their Lie algebras as matrices). This action may, of
course, be restricted to $H \subset G$, and it is easy to see that this restriction quotients to $\mathfrak{g}/\mathfrak{h}$. In our
application to spaces with constant curvature, the vector space $\mathfrak{g}$ has a canonical decomposition

$$\mathfrak{g} = \mathfrak{h}\oplus\mathfrak{p}, \tag{4.101}$$

where (trivially) not only $\mathfrak{h}$, but also $\mathfrak{p}$ is invariant under $\mathrm{Ad}'_h$, for any $h \in H$ (if $H$ is connected,
this invariance requirement is equivalent to $[\mathfrak{h},\mathfrak{p}] \subset \mathfrak{p}$). This is meant in Proposition 4.10.1.
The proof of Proposition 4.10 is based on two ideas. First, a $G$-invariant metric $g'$ on a
homogeneous space $G/H$ is determined by its value at any given point of $G/H$, for which one
takes $H$ (seen as the equivalence class of $e \in G$ in $G/H$). Second, one has an isomorphism

$$T_H(G/H) \cong \mathfrak{g}/\mathfrak{h}, \tag{4.102}$$

which is both linear and $H$-equivariant, in that the linear $H$-action on $T_H(G/H)$ coming from the
$G$-action on $G/H$ is mapped to the $H$-action on $\mathfrak{g}/\mathfrak{h}$ mentioned above.¹⁶⁷ Thus any $G$-invariant
metric $g'$ on $G/H$ is determined by its value $g'_H$ at $H \in G/H$, i.e. by a metric on the tangent
space (4.102). This metric is still constrained by $G$-invariance, whose “infinitesimal shadow” at
$H$ is the adjoint $H$-action $\mathrm{Ad}'_H$ on $\mathfrak{g}/\mathfrak{h}$. If this shadow is sufficiently large, $g'_H$ is even determined
by $\mathrm{Ad}'_H$-invariance (up to a constant scale factor). In words, $g'$ is both homogeneous, i.e. “the
same” if one moves from point to point by the $G$-action, and, if the second part of Proposition
4.10 applies, also isotropic in being “the same” in all directions from a given point of view. This
uniqueness applies in particular to spaces of constant curvature, to which we now return.

Let $O(k,l) \subset GL(k+l,\mathbb{R})$ be the isometry group of the metric $g = \mathrm{diag}(\underbrace{-1,\ldots,-1}_{k},\underbrace{+1,\ldots,+1}_{l})$
on $\mathbb{R}^{k+l}$; elements of $O(k,l)$ are matrices $\gamma \in GL(k+l,\mathbb{R})$ that satisfy $\gamma^Tg\gamma = g$. We will be
interested in $k = 0$, $k = 1$, and $k = 2$ and write $O(l)$ for $O(0,l)$. Then the following holds:


167 Let $L_h(\gamma H) = (h\gamma)H$ be the restriction of the G-action on G/H to $h \in H$. Then the pushforward $L_h'$ maps
$T_{\gamma H}(G/H)$ to $T_{h\gamma H}(G/H)$, and so for $\gamma = e$ we obtain a linear map $L_h' : T_H(G/H) \to T_H(G/H)$.
• O(n+1) acts transitively on $S^n_\rho$. The stabilizer of $(0,\ldots,0,\rho) \in S^n_\rho$ is O(n), seen as a
subgroup of O(n+1) under the obvious “upper left corner” embedding. Hence

$S^n_\rho \cong O(n+1)/O(n)$, (4.103)

first as manifolds. But if O(n+1)/O(n) is equipped with $\rho^2$ times its canonical O(n+1)-invariant
metric (see below), then this diffeomorphism is promoted to an isometry.

• O(1,n) acts transitively on $H^n_\rho$. This time, take $(\rho,0,\ldots,0)$, whose stabilizer is again O(n),
embedded in the “lower right corner” of O(1,n). Thus, with similar comments,

$H^n_\rho \cong O(1,n)/O(n)$. (4.104)

• O(1,n) also acts transitively on de Sitter space $dS^n_\rho$. Returning to $(0,\ldots,0,\rho)$, we obtain

$dS^n_\rho \cong O(1,n)/O(1,n-1)$. (4.105)

• O(2,n−1) acts transitively on anti de Sitter space $AdS^n_\rho$. Taking $(\rho,0,\ldots,0)$ yields

$AdS^n_\rho \cong O(2,n-1)/O(1,n-1)$. (4.106)

• Finally, for the flat Euclidean and Minkowski spaces we have

$\mathbb{R}^n \cong E(n)/O(n)$ (Euclidean); (4.107)
$\mathbb{R}^n \cong P(n)/O(1,n-1)$ (Minkowski), (4.108)

where the semidirect product $E(n) = O(n) \ltimes \mathbb{R}^n$ is the Euclidean group in dimension n,
and likewise $P(n) = O(1,n-1) \ltimes \mathbb{R}^n$ is the Poincaré group in dimension n. These are
the isometry groups of the Euclidean metric and the Minkowski metric on $\mathbb{R}^n$, respectively.


In the Riemannian case the denominator is always H = O(n), whereas in the Lorentzian case 
it is the (Lorentz!) group H = O(1,n—1). It turns out that case 2 of Proposition 4.10 applies: 
in fact, in both cases we have $\mathfrak{g}/\mathfrak{h} \cong \mathbb{R}^n$ and under this isomorphism the adjoint H-action is
simply given by the defining action of H on $\mathbb{R}^n$ (which is certainly irreducible). Thus G-invariant
metrics on all of the above spaces G/H are unique (up to scaling), and hence “canonical”. 
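
As a small numerical illustration of the defining condition $\gamma^T g \gamma = g$ for O(k,l), the following sketch checks it in the simplest Lorentzian case O(1,1), with g = diag(−1,+1). The helper names and the rapidity value are our own choices for illustration and do not occur elsewhere in the text.

```python
import math

def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def preserves_metric(gamma, g, tol=1e-12):
    """Check gamma^T g gamma = g, the defining condition of O(k, l)."""
    lhs = matmul(transpose(gamma), matmul(g, gamma))
    n = len(g)
    return all(abs(lhs[i][j] - g[i][j]) < tol
               for i in range(n) for j in range(n))

# g = diag(-1, +1): one minus sign and one plus sign, so the group is O(1, 1).
g = [[-1.0, 0.0], [0.0, 1.0]]

chi = 0.7  # hypothetical rapidity / angle, chosen only for illustration
boost = [[math.cosh(chi), math.sinh(chi)],
         [math.sinh(chi), math.cosh(chi)]]
rotation = [[math.cos(chi), -math.sin(chi)],
            [math.sin(chi), math.cos(chi)]]

boost_ok = preserves_metric(boost, g)        # a boost preserves g
rotation_ok = preserves_metric(rotation, g)  # a Euclidean rotation does not
```

The same `preserves_metric` check applies unchanged to O(n+1), O(1,n), or O(2,n−1) once g is replaced by the corresponding diagonal matrix.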


Corollary 4.11 If n = dim(M) ≥ 2, then the following list (where each space G/H is equipped
with its canonical G-invariant metric) gives all complete and simply connected spaces M of
constant curvature, up to isometry and up to rescaling of the metric by a positive constant:

• Riemannian: i) k > 0: O(n+1)/O(n). ii) k = 0: E(n)/O(n). iii) k < 0: O(1,n)/O(n).

• Lorentzian: i) k > 0: O(1,n)/O(1,n−1) (for n = 2 one needs its Lorentzian cover).
ii) k = 0: P(n)/O(1,n−1). iii) k < 0: the Lorentzian cover of O(2,n−1)/O(1,n−1).


Finally, realizations of homogeneous spaces as G/H are not unique; one may have $G'/H' \cong G/H$
(think of $(G \times G)/G \cong G/\{e\}$). As a case in point, all of the above groups are disconnected:
O(l) has two (connected) components, of which SO(l) is the one containing the identity,
and O(1,l) and O(2,l) even have four. The above way of writing down the isomorphisms has
the advantage that O(n+1) is the full isometry group of $S^n_\rho$, and likewise O(1,n) for both $H^n_\rho$
and $dS^n_\rho$, and O(2,n−1) for $AdS^n_\rho$. However, each isomorphism is also true if both groups in
the quotient are replaced by their identity components, e.g. $S^n_\rho \cong SO(n+1)/SO(n)$, etc.
4.5 Ricci tensor and Ricci scalar


The Riemann tensor contains all information about curvature. There also exist weaker measures 
of curvature. The main actor in GR is the Ricci tensor, which like the metric has type (2,0):

$\mathrm{Ric}(X,Y) := \sum_{a=1}^{n} \mathrm{Riem}(e_a, X, e_a, Y); \qquad \mathrm{Ric}_{ij} \equiv R_{ij} = R^k{}_{ikj} = g^{kl} R_{kilj}$, (4.109)

where $(e_a)$ is any orthonormal frame.¹⁶⁸ Note that Ric is symmetric by (4.36). From Ric, we
define the scalar curvature by

$R := \sum_{a} \mathrm{Ric}(e_a,e_a) = \sum_{a,b} C(e_a,e_b) = g^{ij} R_{ij}$, (4.110)

where of course in the second sum the terms a = b do not contribute and hence due to symmetry
the sum just has $(n^2 - n)/2$ independent terms, each occurring twice. For example, in n = 3 the Ricci scalar (at a point x) is,
up to a factor 6, the average of the sectional curvatures of the x-y, x-z, and y-z planes (within the tangent space $T_xM$), i.e. twice their sum.

Furthermore, the Ricci tensor defines two Einstein tensors, most easily by their components

$G_{ij} := R_{ij} - \tfrac{1}{2} g_{ij} R$; (4.111)
$E_{ij} := R_{ij} - \tfrac{1}{n} g_{ij} R$. (4.112)

Physicists use $G_{ij}$ because, as will be explained later, it emerges from the calculus of variations
applied to the functional $g \mapsto \int_M R(g)$. Mathematicians, on the other hand, use $E_{ij}$ because it is
simply the traceless part of Ric (note that $g^{ij}E_{ij} = 0$). Moreover, to explain the name, suppose

$\mathrm{Ric} = \lambda g; \qquad R_{ij} = \lambda g_{ij}$, (4.113)

for some constant $\lambda \in \mathbb{R}$, in which case we say that (M,g) is an Einstein manifold, and that g
is an Einstein metric. Then $R = \lambda \cdot n$ is constant and $\lambda = R/n$, so that (4.113) implies $E_{ij} = 0$.
In d > 2, also the converse is true;¹⁶⁹ this follows from the Bianchi identity (4.25). Thus:


Proposition 4.12 For n > 2, a metric satisfies (4.113) iff its Einstein tensor (4.112) vanishes. 
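
The step from vanishing traceless Einstein tensor to constancy of R can be made explicit as follows. This is only a sketch: we assume that (4.25) is the differential (second) Bianchi identity, whose double contraction is the second line below.

```latex
\begin{align*}
E_{ij} = 0 \;&\Longrightarrow\; R_{ij} = \tfrac{1}{n} R\, g_{ij}
  \;\Longrightarrow\; \nabla^i R_{ij} = \tfrac{1}{n} \nabla_j R, \\
\text{contracted Bianchi identity:}\quad & \nabla^i R_{ij} = \tfrac{1}{2} \nabla_j R, \\
\text{hence}\quad & \left(\tfrac{1}{2} - \tfrac{1}{n}\right) \nabla_j R = 0 .
\end{align*}
```

For n > 2 the prefactor is nonzero, so R is constant on (connected) M and (4.113) holds with λ = R/n. For n = 2 the prefactor vanishes and no constraint on R results, in agreement with the fact that non-constant R is possible in two dimensions.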


The symmetries (4.36) enable one to count the number of independent components of the
Riemann tensor in various dimensions n, namely $n^2(n^2-1)/12$ (check!). Therefore:

1. For n = 2 the Riemann tensor has just one independent component $R_{1212}$, and also

$g^{-1} \equiv \begin{pmatrix} g^{11} & g^{12} \\ g^{21} & g^{22} \end{pmatrix} = \frac{1}{\det(g)} \begin{pmatrix} g_{22} & -g_{12} \\ -g_{12} & g_{11} \end{pmatrix}$, (4.114)

so that the Ricci tensor $R_{ij} = g^{kl}R_{kilj}$ must equal $R_{ij} = g_{ij}R_{1212}/\det(g)$. This gives

$R_{ijkl} = \tfrac{1}{2} R\,(g_{ik}g_{jl} - g_{il}g_{jk}); \qquad R_{ij} = \tfrac{1}{2} g_{ij} R$, (4.115)

cf. (4.85). Hence $R_{1212} = \tfrac{1}{2}\det(g)\cdot R = \det(g)\cdot K$, where the Gaussian curvature K is
given, either as a definition or as a theorem,¹⁷⁰ by one of the equivalent expressions

$K = C(\partial_1,\partial_2) = R_{1212}/\det(g) = \tfrac{1}{2}R$. (4.116)


168 Authors use various sign conventions for the Riemann tensor, but all Ricci tensors and scalars coincide. 
169 We will shortly see that $E_{ij} = 0$ in d = 2, where we know since Gauss that non-constant R is certainly possible.
170 See §4.3, as well as e.g. Heckman (2017), Theorem 3.15.
2. For n = 3, the Riemann tensor has 6 independent components, as does the Ricci tensor!
So these two must carry the same information.¹⁷¹ This can be understood from linear
algebra, as follows. If V has an inner product, any symmetric bilinear map $T : V \otimes V \to \mathbb{R}$
is equivalent to a self-adjoint linear map $\hat T : V \to V$ via $T(v \otimes w) = \langle v, \hat T w \rangle$. In particular,
the Ricci tensor $\mathrm{Ric}_x : T_xM \otimes T_xM \to \mathbb{R}$ at a point $x \in M$ is equivalent to a linear map

$\widehat{\mathrm{Ric}}_x : T_xM \to T_xM; \qquad g_x(X, \widehat{\mathrm{Ric}}_x Y) = \mathrm{Ric}_x(X,Y)$. (4.117)

If dim(V) = 3, then $\dim(\Lambda^3 V) = 1$, and any nonzero $\mathrm{Vol} \in \Lambda^3 V$ gives an isomorphism

$\Lambda^2 V \cong V^*, \quad A \mapsto \hat A; \qquad A \wedge v = \hat A(v)\,\mathrm{Vol}$. (4.118)

This follows from a dimension count: if V has a basis $(e_1,e_2,e_3)$, then $\Lambda^2 V$ has a basis
$(e_1 \wedge e_2, e_2 \wedge e_3, e_3 \wedge e_1)$. One may then take $\mathrm{Vol} = e_1 \wedge e_2 \wedge e_3$, and take the basis $(\omega^1,\omega^2,\omega^3)$
of $V^*$ dual to the basis of V (i.e. $\omega^i(e_j) = \delta^i_j$), so that if $A = \tfrac{1}{2}A^{ij} e_i \wedge e_j \in \Lambda^2 V$, we find

$\hat A_i = \tfrac{1}{2}\varepsilon_{ijk} A^{jk}$. (4.119)

Here $\varepsilon_{ijk}$ is the totally antisymmetric (Levi-Civita) symbol, that is, $\hat A_1 = A^{23}$, $\hat A_2 = -A^{13}$,
and $\hat A_3 = A^{12}$. Dually, $V \cong (\Lambda^2 V)^* \cong \Lambda^2 V^*$ under $v \mapsto \hat v$, where $\hat v_{ij} = \varepsilon_{ijk} v^k$.
Consequently, in n = 3 (only!), one has $\Lambda^2 T_xM \cong T^*_xM \cong T_xM$. This isomorphism also
makes linear maps $\Lambda^2 T_xM \to \Lambda^2 T_xM$ and $T_xM \to T_xM$ equivalent, so that the maps (4.40)
and (4.117) are essentially the same. If the Ricci tensor as in (4.117) is diagonalized by
an orthonormal basis $(e_1,e_2,e_3)$ of $T_xM$ with eigenvalues $(\lambda_1,\lambda_2,\lambda_3)$, then the Riemann
tensor as in (4.40) is diagonal with respect to the basis $(e_1 \wedge e_2, e_2 \wedge e_3, e_3 \wedge e_1)$ of $\Lambda^2 T_xM$,
with eigenvalues $(\lambda_1 + \lambda_2 - \lambda_3,\ \lambda_2 + \lambda_3 - \lambda_1,\ \lambda_1 - \lambda_2 + \lambda_3)$. The Ricci scalar is given by

$R_x = \lambda_1 + \lambda_2 + \lambda_3$. (4.120)

Though derived from the Ricci tensor, an interesting tensor in n = 3 is the Cotton tensor

$C_{ijk} = \nabla_k R_{ij} - \nabla_j R_{ik} + \tfrac{1}{4}(g_{ik}\nabla_j R - g_{ij}\nabla_k R)$. (4.121)

Much as the Riemann tensor detects if a space(-time) is flat, cf. Theorem 4.1, the Cotton
tensor detects conformal flatness. First, C is invariant under rescalings $g \mapsto \Omega^2 g$ (i.e.
conformal transformations), where $\Omega \in C^\infty(M)$ is strictly positive. Since it vanishes for
flat metrics, it then also vanishes if $g = \Omega^2\delta$ (or $\Omega^2\eta$). The converse can also be proved.
Hence: a 3d space or space-time is conformally flat iff its Cotton tensor vanishes.¹⁷²


3. For n = 4 (the case of interest to physics), the Riemann tensor has 20 independent
components, whereas the Ricci tensor only has 10. The geometric information in the
Riemann tensor that is not passed on to the Ricci tensor is contained in the Weyl tensor

$W_{ijkl} = R_{ijkl} + \left(g_{i[l}R_{k]j} + g_{j[k}R_{l]i}\right) + \tfrac{1}{3} R\, g_{i[k}g_{l]j}$, (4.122)

where [⋯] antisymmetrizes the enclosed indices with weight ½ (e.g. $g_{i[k}R_{l]j} = \tfrac{1}{2}(g_{ik}R_{lj} - g_{il}R_{kj})$). The Weyl
tensor (“Weyl”) has the same symmetries as the Riemann tensor, cf. (4.36) - (4.38), and is in addition trace-free, so that
Weyl has 10 independent components, like Ric. Everything just said in n = 3 about
the Cotton tensor is now valid in n = 4 for the Weyl tensor (which vanishes in n = 3).
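
The component counts quoted in items 1 to 3 can be tabulated directly from the formula $n^2(n^2-1)/12$ for the Riemann tensor and $n(n+1)/2$ for a symmetric tensor of type (2,0) such as Ric. The function names below are ours, introduced only for this illustration.

```python
def riemann_components(n):
    """Independent components of the Riemann tensor: n^2 (n^2 - 1) / 12."""
    return n * n * (n * n - 1) // 12

def ricci_components(n):
    """Independent components of a symmetric (2,0) tensor: n (n + 1) / 2."""
    return n * (n + 1) // 2

# counts[n] = (Riemann, Ricci) for the dimensions discussed in the text.
counts = {n: (riemann_components(n), ricci_components(n)) for n in (2, 3, 4)}

# In n = 4 the components of Riem not captured by Ric are those of Weyl.
weyl_4d = riemann_components(4) - ricci_components(4)
```

Note that for n = 2 the formal count of 3 for a symmetric tensor overstates the freedom in Ric, which by (4.115) is fixed by the single function R.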


product (P © Q) jj := PQ jk + Pik Qi — PQ ji — Pj Qix, this reads Riem = 7R(g Og) +E ©g, cf. (4.112). 
172 The original reference is Cotton (1899), see also Eisenhart (1926), §28. A modern treatment is Garcia et al.
(2004). There is no analogue of the Cotton or Weyl tensors in n = 2, since every 2d metric is conformally flat. 
4.6 Submanifolds and hypersurfaces


As we saw in §4.3, differential geometry started with the study of two-dimensional submanifolds
Σ of $\mathbb{R}^3$ (i.e. surfaces) by Gauss. In GR, a crucial role will be played by various submanifolds $\tilde M$
of a four-dimensional Lorentzian manifold M. One may define a submanifold $\tilde M$ of a manifold
M in two equivalent ways: either as a subset $\tilde M \subset M$ of M with certain (good) properties, or
as a manifold $\tilde M$ in its own right (a concept already defined, of course) plus an explicit map
$F : \tilde M \to M$ with equivalent good properties. The former leads to the latter by considering the
inclusion map $F : \tilde M \hookrightarrow M$, whereas the latter leads to the former by identifying $\tilde M$ with its image
$F(\tilde M) \subset M$ (which may lead to some confusion!). We put n := dim(M) as usual.


Definition 4.13 If M is a manifold, a map $F : \tilde M \to M$ defines a submanifold $F(\tilde M) \subset M$ iff:
1. F is a homeomorphism onto its image $F(\tilde M)$.¹⁷³ In particular, F is injective.
2. $F'_u : T_u\tilde M \to T_{F(u)}M$ is injective for all $u \in \tilde M$. Equivalently, the rank of $F'_u$ equals $\dim(\tilde M)$.
Equivalently,¹⁷⁴ a subset $\tilde M \subset M$ is a k-dimensional submanifold of M iff each $x \in \tilde M$ has an
open nbhd U in M for which there is a chart $\varphi : U \to V \subset \mathbb{R}^n$ (for M) whose image takes the form

$\varphi(U \cap \tilde M) = \varphi(U) \cap X$, (4.123)

where X is a k-dimensional affine linear subspace of $\mathbb{R}^n$. If k = n−1, $\tilde M$ is called a hypersurface.


Until the end of this chapter we assume $\tilde M$ is a hypersurface, i.e. $\dim(\tilde M) = n-1$. If M carries a
metric tensor g, then, generalizing (4.50) in §4.3, $\tilde M$ inherits a (not necessarily metric!) tensor

$\tilde g := \iota^* g; \qquad \tilde g \in \mathfrak{X}^{(2,0)}(\tilde M)$, (4.124)

defined by the inclusion $\iota : \tilde M \hookrightarrow M$. Identifying $\tilde M$ with $\iota(\tilde M)$, this simply means that

$\tilde g_x(X_x, Y_x) = g_x(X_x, Y_x)$, (4.125)

for any $X_x, Y_x \in T_x\tilde M \subset T_xM$, with $x \in \tilde M$. It is easy to see that if (M, g) is Riemannian, then so is
$(\tilde M, \tilde g)$. But in the Lorentzian case the induced “metric” $\tilde g$ need not be non-degenerate.


Lemma 4.14 Let $g : V \times V \to \mathbb{R}$ be a symmetric nondegenerate bilinear map on a real vector
space V. Then for any linear subspace $W \subset V$, with $W^\perp := \{v \in V \mid g(v,w) = 0 \ \forall w \in W\}$,

$\dim(W) + \dim(W^\perp) = \dim(V)$; (4.126)
$(W^\perp)^\perp = W$. (4.127)

For the proof see O’Neill (1983), Lemma 2.22. Taking $V = T_xM$ and $W = T_x\tilde M$, this yields

$\dim((T_x\tilde M)^\perp) = 1$. (4.128)

Hence at each $x \in \tilde M$ one has a normal (vector) $N_x \in (T_x\tilde M)^\perp \subset T_xM$, i.e. $g(N_x,X_x) = 0$ for all
$X_x \in T_x\tilde M$, which by (4.128) is unique up to scalar multiplication (but we assume $N_x \neq 0$).
If (M,g) is Riemannian, then we may normalize each $N_x$ such that

$g_x(N_x, N_x) = 1$. (4.129)


173 Dropping this condition defines an immersed submanifold; what we define is an embedded submanifold.
174 See e.g. Andrews (undated), Proposition 3.2.1, combined with Proposition 1.31 in O’Neill (1983).
This still yields no canonical choice of $N_x$, but any two choices only differ by a sign and we
assume that we can make a smooth choice $x \mapsto N_x$ throughout $\tilde M$, which is always Riemannian.¹⁷⁵
The Lorentzian case is much richer. A hypersurface may not fall into any of the three classes
below since the sign of $g_x(N_x,N_x)$ may change with x, but let us assume it is fixed (or zero).


Definition 4.15 A hypersurface $\tilde M \subset M$, with nonzero normal vector field N, is called:

• spacelike iff $g_x(N_x,N_x) < 0$ for each $x \in \tilde M$. This is the case iff the induced metric $g_{|\tilde M}$ is
positive definite, so that $(\tilde M, g_{|\tilde M})$ is a 3d Riemannian manifold.

• timelike or Lorentzian iff $g_x(N_x,N_x) > 0$ for each $x \in \tilde M$. This is the case iff the induced
metric $g_{|\tilde M}$ is Lorentzian (obviously with signature (−++)).

• null iff $g_x(N_x,N_x) = 0$ for each $x \in \tilde M$. This is the case iff $g_{|\tilde M}$ is degenerate.
To explain the last remark, we first note that in the null case, $N_x$ is both normal and tangent to its
null hypersurface. To see this, take $W = T_x\tilde M$ in Lemma 4.14, so that $W^\perp = \mathbb{R}\cdot N_x$, whence

$T_x\tilde M = ((T_x\tilde M)^\perp)^\perp = (\mathbb{R}\cdot N_x)^\perp$. (4.130)

Hence $N_x \in T_x\tilde M$ (because $g_x(N_x,N_x) = 0$), but by definition of a normal, there exists no $X_x \in T_x\tilde M$ for which $g(N_x,X_x) \neq 0$,
so that $g_{|\tilde M}$ is degenerate. Furthermore, whereas timelike normals are usually normalized as

$g_x(N_x,N_x) = -1$, (4.131)

and spacelike normals usually satisfy (4.129), in the null case the normals $N_x$ lack a natural normalization.
To soften this, note that $T_x\tilde M$ cannot contain any null vector $N'_x$ linearly independent
of $N_x$ (for in that case $g_x(N_x,N'_x) = 0$, which would contradict the Lorentzian signature of g).
Therefore, we can find a second null vector field $\bar N_x \in T_xM$ (pointing outside $T_x\tilde M$) such that

$g_x(N_x,\bar N_x) = -1$. (4.132)

Lemma 4.16 If $\tilde M$ is null, any $X_x \in T_x\tilde M$ is either proportional to $N_x$ (hence null), or spacelike.

Proof. Suppose $T_x\tilde M$ contains a timelike vector $T_x$; then $g(T_x + \lambda N_x, T_x + \lambda N_x) = g(T_x,T_x) < 0$ for all λ
(since $g(T_x,N_x) = 0$ and $g(N_x,N_x) = 0$), but a computation in coordinates shows that any sum $T_x + \lambda N_x$ of a timelike vector and a null
vector becomes spacelike for large λ of suitable sign. The claim follows by the argument after (4.131).


This is important, because it shows that a null hypersurface has a canonical lightlike direction
given by its normal (!); see §5.3 and §6.3 for further discussion, especially Proposition 6.9.

There are two basic examples of null hypersurfaces in Minkowski space-time $(\mathbb{M},\eta)$. On the
one hand, we have null hyperplanes such as u := t − r or v := t + r constant, or more generally
the set of all vectors orthogonal to a given null vector (and translates thereof). On the other hand,
we have forward or backward lightcones, see §5.3. In GR, in the context of black holes, event
horizons and Cauchy horizons are null hypersurfaces (see §10.7), and null hypersurfaces also
play an important role in the setting of Penrose’s singularity theorem (see §6.3).

Spacelike hypersurfaces are also very important in GR, especially Cauchy surfaces; the
simplest example in $\mathbb{M}$ is $x^0$ = constant. Similarly, for a timelike hypersurface we may take
$x^i$ = constant for i = 1, 2, or 3. A more spectacular example in GR is the photon sphere in
Schwarzschild space-time (see §9.2), and also “naked singularities” are timelike (see chapter 9).
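
The Minkowski examples just mentioned can be checked against Definition 4.15 by evaluating the sign of η(N,N) on a normal vector. In the sketch below the function names are ours, and the plane $x^0 - x^1$ = constant is our own choice of a null hyperplane (a planar stand-in, assumed for illustration); normals are given up to sign and scale.

```python
def minkowski(v, w):
    """Minkowski inner product with signature (-+++)."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def classify(normal):
    """Classify a hypersurface by the sign of g(N, N), as in Definition 4.15."""
    q = minkowski(normal, normal)
    if q < 0:
        return "spacelike"
    if q > 0:
        return "timelike"
    return "null"

# Normal vectors in coordinates (x^0, x^1, x^2, x^3), up to sign and scale.
examples = {
    "x^0 = constant": (1, 0, 0, 0),
    "x^1 = constant": (0, 1, 0, 0),
    "x^0 - x^1 = constant": (1, 1, 0, 0),
}
result = {name: classify(n) for name, n in examples.items()}

# In the null case the normal is also tangent: it is orthogonal to itself.
null_normal_is_tangent = minkowski((1, 1, 0, 0), (1, 1, 0, 0)) == 0
```

The last line illustrates the remark after Definition 4.15: for the null plane, the normal (1,1,0,0) lies in the plane it is normal to.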


175 A sufficient condition for this, i.e. triviality of the normal bundle, is that M be connected and simply connected 
(Kobayashi & Nomizu, 1969, p. 5). Since the criteria in Definition 4.15 are independent of the sign of $N_x$, by Lemma
4.14 the classification in this definition is even well defined if no continuous choice $x \mapsto N_x$ exists on M.
4.7 Gauss-Weingarten and Gauss-Codazzi equations


Let $\tilde M \subset M$ be a hypersurface with normal N; if M is Lorentzian we assume $\tilde M$ is spacelike (this
case is fundamental to GR). The orthogonal projection from $T_xM$ onto $T_x\tilde M$ is given by

$\pi_x : T_xM \to T_x\tilde M \subset T_xM$, (4.133)
$\pi_x(X_x) = X_x - g_x(X_x,N_x)N_x$ (Riemannian case); (4.134)
$\pi_x(X_x) = X_x + g_x(X_x,N_x)N_x$ (Lorentzian spacelike case), (4.135)

so that $\pi(N) = 0$ and $\pi_x(X_x) = X_x$ if $X_x \in T_x\tilde M$ (this projection is independent of the choice of
$N_x$). Then (M, g) and $(\tilde M, \tilde g)$ each have their Levi-Civita connections ∇ and $\tilde\nabla$, respectively.
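
The properties of the projection in the Lorentzian spacelike case (4.135) are easily verified numerically. The following sketch does so in Minkowski space-time for the hypersurface $x^0$ = constant; the function names and the test vector are hypothetical and serve only as an illustration.

```python
def minkowski(v, w):
    """Minkowski inner product with signature (-+++)."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def project(x, n):
    """Lorentzian spacelike case (4.135): pi(X) = X + g(X, N) N, g(N, N) = -1."""
    c = minkowski(x, n)
    return tuple(xi + c * ni for xi, ni in zip(x, n))

N = (1.0, 0.0, 0.0, 0.0)    # unit timelike normal of the hypersurface x^0 = constant
X = (2.0, 3.0, -1.0, 0.5)   # arbitrary test vector (illustrative values)

pN = project(N, N)     # the normal is projected to zero
pX = project(X, N)     # the x^0 component is removed
ppX = project(pX, N)   # projecting twice changes nothing (idempotence)
```

Replacing N by −N leaves `project` unchanged, since N enters quadratically; this is the independence of the choice of $N_x$ noted above.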


Proposition 4.17 The connection $\tilde\nabla$ on $\tilde M$ is related to the connection ∇ on M by

$\pi(\nabla_X Y) = \tilde\nabla_X Y \qquad (X,Y \in \mathfrak{X}(\tilde M))$. (4.136)

Here the covariant derivative $\tilde\nabla_X Y$ on the right-hand side is clearly defined (as an element of
$\mathfrak{X}(\tilde M)$), but also the covariant derivative $\nabla_X Y$ in M on the left-hand side is well defined, even
though Y is merely a vector field on $\tilde M$ rather than on all of M: as in the comment preceding
(3.40), if $X \in \mathfrak{X}(\tilde M)$ and $Y \in \mathfrak{X}(M)$, then the value of $\nabla_X Y$ only depends on the restriction of Y
to $\tilde M$ (indeed, it only depends on the values of Y along the flow lines of X, which lie in $\tilde M$), and
so $\nabla_X Y$ is defined (as a vector field along $\tilde M$) even when $Y \in \mathfrak{X}(\tilde M)$.¹⁷⁶

Proof. We write $\nabla'_X Y$ for $\pi(\nabla_X Y)$, so that (in the Lorentzian case for simplicity)

$\nabla'_X Y = \nabla_X Y + g(\nabla_X Y, N)N$. (4.137)

We first check that ∇′ is a covariant derivative on $\mathfrak{X}(\tilde M)$. Linearity in Y is obvious (since both g
and $\nabla_X$ are linear), as is $C^\infty(\tilde M)$-linearity in X, cf. (3.32). The Leibniz rule (3.33) follows from the
corresponding rule for ∇ and the property $g((Xf)Y,N) = (Xf)g(Y,N) = 0$ (since $Y \in \mathfrak{X}(\tilde M)$).
To identify ∇′ with $\tilde\nabla$, we need to check that ∇′ is torsion-free and metric. First,

$\nabla'_X Y - \nabla'_Y X = \nabla_X Y - \nabla_Y X + g(\nabla_X Y - \nabla_Y X, N)N$
$\qquad = [X,Y] + g([X,Y],N)N = [X,Y]$, (4.138)

since ∇ (being the Levi-Civita connection on TM) is torsion-free, and $[X,Y] \in \mathfrak{X}(\tilde M)$, assuming
$X,Y \in \mathfrak{X}(\tilde M)$, so that $g([X,Y],N) = 0$. Second, ∇′ should satisfy (3.50), i.e.

$X(\tilde g(Y,Z)) = \tilde g(\nabla'_X Y,Z) + \tilde g(Y,\nabla'_X Z) \qquad (X,Y,Z \in \mathfrak{X}(\tilde M))$. (4.139)

This is quite obvious, since for $X,Y,Z \in \mathfrak{X}(\tilde M)$ we have

$\tilde g(\nabla'_X Y,Z) = g(\nabla'_X Y,Z) = g(\nabla_X Y + g(\nabla_X Y,N)N, Z) = g(\nabla_X Y,Z)$, (4.140)

since g(N,Z) = 0, and so the right-hand side of (4.139) equals $g(\nabla_X Y,Z) + g(Y,\nabla_X Z)$. By
(3.50) for ∇ and g, this in turn equals $X(g(Y,Z)) = X(\tilde g(Y,Z))$. This gives (4.139).
The claim now follows from Theorem 3.9.


176 In other words, if one insists that $\nabla_X : \mathfrak{X}(M) \to \mathfrak{X}(M)$, one may extend $Y \in \mathfrak{X}(\tilde M)$ to an arbitrary vector field
on M, and if $X \in \mathfrak{X}(\tilde M)$, then $\nabla_X Y$ is independent of this extension.
Eq. (4.136) implies the general Gauss-Weingarten equations, where still $X,Y \in \mathfrak{X}(\tilde M)$:

$\nabla_X Y = \tilde\nabla_X Y + k(X,Y)N$ (Riemann); (4.141)
$\nabla_X Y = \tilde\nabla_X Y - k(X,Y)N$ (Lorentz); (4.142)
$\nabla_X N =: -W(X)$. (4.143)

Here we take (4.143) to be the definition of the Weingarten map $W_x : T_x\tilde M \to T_x\tilde M$, noting that
since g(N,N) = ±1, we have $g(\nabla_X N,N) = 0$ and hence $\nabla_X N \in T\tilde M$. Furthermore,

$k(X,Y) := \tilde g(W(X),Y) = -g(\nabla_X N,Y)$ (4.144)

defines the extrinsic curvature $k \in \mathfrak{X}^{(2,0)}(\tilde M)$. As in (4.59), from the property

$g(\nabla_X Y,N) = -g(Y,\nabla_X N)$, (4.145)

which is proved as in the text between (4.58) and (4.59), we infer that k is symmetric, viz.

$k(X,Y) = -g(\nabla_X N,Y) = g(N,\nabla_X Y) = g(N,\nabla_Y X) = k(Y,X)$. (4.146)

Eqs. (4.141) - (4.142) then easily follow from (4.136), giving the (“parallel”) component in $T\tilde M$,
and from taking the inner product with N, using (4.129) - (4.131), giving the normal component.
We also derive the general Gauss-Codazzi equations, which, for $W,X,Y,Z \in \mathfrak{X}(\tilde M)$, are:

$\mathrm{Riem}(W,Z,X,Y) = \widetilde{\mathrm{Riem}}(W,Z,X,Y) + k(W,Y)k(X,Z) - k(W,X)k(Y,Z)$ (R); (4.147)
$\mathrm{Riem}(W,Z,X,Y) = \widetilde{\mathrm{Riem}}(W,Z,X,Y) + k(W,X)k(Y,Z) - k(W,Y)k(X,Z)$ (L); (4.148)
$\mathrm{Riem}(N,Z,X,Y) = (\tilde\nabla_X k)(Y,Z) - (\tilde\nabla_Y k)(X,Z)$, (4.149)

where $\mathrm{Riem} \in \mathfrak{X}^{(4,0)}(M)$ and $\widetilde{\mathrm{Riem}} \in \mathfrak{X}^{(4,0)}(\tilde M)$ are the Riemann curvature tensors for the Levi-Civita
connection ∇ on TM (for g) and $\tilde\nabla$ on $T\tilde M$ (for $\tilde g$), respectively. The Codazzi relation
(4.149) is the same for the Riemannian and the Lorentzian cases. These equations follow from
two computations, which we perform for the Lorentzian case, i.e. using (4.142). The first is:

$\nabla_X\nabla_Y Z = \nabla_X(\tilde\nabla_Y Z - k(Y,Z)N)$
$\qquad = \tilde\nabla_X\tilde\nabla_Y Z - k(X,\tilde\nabla_Y Z)N - X(k(Y,Z))\cdot N - k(Y,Z)\nabla_X N$
$\qquad = \tilde\nabla_X\tilde\nabla_Y Z + W(X)k(Y,Z) - (k(X,\tilde\nabla_Y Z) + X(k(Y,Z)))N$. (4.150)

The second computation, which uses torsion-freeness of $\tilde\nabla$, i.e. $\tilde\nabla_X Y - \tilde\nabla_Y X = [X,Y]$, is

$\nabla_{[X,Y]}Z = \tilde\nabla_{[X,Y]}Z - k([X,Y],Z)N = \tilde\nabla_{[X,Y]}Z - (k(\tilde\nabla_X Y,Z) - k(\tilde\nabla_Y X,Z))N$. (4.151)

The definition (4.10) of curvature, combined with the “covariant Leibniz rule”

$X(k(Y,Z)) = (\tilde\nabla_X k)(Y,Z) + k(\tilde\nabla_X Y,Z) + k(Y,\tilde\nabla_X Z)$, (4.152)

which is a special case of (3.65), then yields, after some neat cancellations:¹⁷⁷

$\Omega(X,Y)Z = (\nabla_X\nabla_Y - \nabla_Y\nabla_X - \nabla_{[X,Y]})Z = \tilde\Omega(X,Y)Z$
$\qquad + W(X)k(Y,Z) - W(Y)k(X,Z) + ((\tilde\nabla_Y k)(X,Z) - (\tilde\nabla_X k)(Y,Z))N$. (4.153)

Taking the (metric) inner product with W and using (4.144) yields Gauss’s equation (4.148),
whereas the inner product with N and using (4.131) yields Codazzi’s equation (4.149).


177 Recall that unlike k, the metric is covariantly constant, i.e. $\tilde\nabla_X \tilde g = 0$ for all $X \in \mathfrak{X}(\tilde M)$, cf. (3.67).
4.8 Fundamental theorem for hypersurfaces


The classical theory culminates in the fundamental theorem for hypersurfaces, which was
proved (by different means) in the 19th century. We discuss the proof in some detail,¹⁷⁸ since it
will turn out to be a good preparation for the 3+1 split of the Einstein equations later on.

Theorem 4.18 Let $(\tilde M,\tilde g)$ be a connected and simply connected m-dimensional Riemannian manifold
equipped with a symmetric tensor $k \in \mathfrak{X}^{(2,0)}(\tilde M)$ satisfying the Gauss-Codazzi equations

$\widetilde{\mathrm{Riem}}(W,Z,X,Y) + k(W,Y)k(X,Z) - k(W,X)k(Y,Z) = 0$; (4.154)
$(\tilde\nabla_X k)(Y,Z) - (\tilde\nabla_Y k)(X,Z) = 0$. (4.155)

Then there exists an isometric embedding $F : \tilde M \to \mathbb{R}^{m+1}$ for which the extrinsic curvature is the
given tensor k. Such an embedding is unique up to isometry, which in the case at hand (i.e. $\mathbb{R}^{m+1}$
with Euclidean metric) means: up to combinations of translations, rotations, and reflections.

Note that (4.154) - (4.155) arise from (4.147) - (4.149) by putting Riem = 0 (because $\mathbb{R}^{m+1}$ is
equipped with the flat Euclidean metric), and have (4.79) - (4.80) as their coordinate version.
The latter were admittedly written down and derived for m = 2, but simply letting the indices
α, β etc. run from 1 to m rather than from 1 to 2 immediately generalizes our treatment of the
classical theory of surfaces to any dimension (alas with some loss of visualisability).

We just prove a local version of Theorem 4.18 by PDE methods, which is enough to show
the role of the Gauss-Codazzi equations as integrability conditions. So let us initially assume
we found an $F : U \to \mathbb{R}^{m+1}$ satisfying the conditions in the theorem, where $U \subset \tilde M$ is open. We
make F unique by imposing the conjunction of the following local conditions:


1. For arbitrary $u_0 \in U$ and $x_0 \in \mathbb{R}^{m+1}$, the map F satisfies $F(u_0) = x_0$;

2. For some fixed orthonormal basis $(e_1,\ldots,e_m)$ of $T_{u_0}\tilde M$ and some given orthonormal basis
$(f_1,\ldots,f_{m+1})$ of $T_{x_0}\mathbb{R}^{m+1} \cong \mathbb{R}^{m+1}$, its derivative satisfies $F'_{u_0}(e_\alpha) = f_\alpha$ (α = 1,...,m).

Without loss of generality we may choose geodesic normal coordinates on U relative to $u_0$, cf.
(5.33) - (5.38) below, so that $e_\alpha = \partial_\alpha \equiv \partial/\partial u^\alpha$ is indeed orthonormal at least at $u_0$. Furthermore,
we may pick coordinates $(x^i)$ on $\mathbb{R}^{m+1}$ (i = 1,...,m+1) such that $f_i = \partial/\partial x^i$ for i = 1,...,m.
The components $F^i(u^\alpha)$ of $F : U \to \mathbb{R}^{m+1}$ then satisfy the (initial) condition

$\frac{\partial F^i}{\partial u^\alpha}(0) = \delta^i_\alpha$ (α = 1,...,m, i = 1,...,m); (4.156)
$\frac{\partial F^{m+1}}{\partial u^\alpha}(0) = 0$ (α = 1,...,m). (4.157)

In addition to F, we have to define a normal vector field N on U, whose components $N^i$ satisfy

$N^i(u_0) = 0$ (i = 1,...,m); (4.158)
$N^{m+1}(u_0) = 1$. (4.159)


178 Cf. Kobayashi & Nomizu (1969), §VII.7, considerably rewritten. The argument uses some exterior calculus.
For general $\tilde M$ the above theorem holds at least locally, in that any $u_0 \in \tilde M$ has a connected and simply connected
neighbourhood $U \in \mathcal{O}(\tilde M)$ for which the above claims hold.
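If we recall (3.61), whose asterisk we omit, for each i = 1,...,m+1 we have

$(\tilde\nabla_\alpha dF^i)_\beta = \frac{\partial^2 F^i}{\partial u^\alpha \partial u^\beta} - \Gamma^\gamma_{\alpha\beta}\frac{\partial F^i}{\partial u^\gamma}$, (4.160)

where $\tilde\nabla_\alpha := \tilde\nabla_{\partial/\partial u^\alpha}$. Therefore, introducing 1-forms $\theta^i \in \Omega(U)$ for each i = 1,...,m+1 via

$\theta^i = dF^i$, (4.161)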

Gauss’s equation (4.77) for (X) turns (4.160) with (4.161) into 
$(\tilde\nabla_\alpha\theta^i)_\beta = k_{\alpha\beta} N^i$ (α, β = 1,...,m). (4.162)


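Conversely, if $\theta^i \in \Omega(U)$ satisfies (4.162), there exists $F^i \in C^\infty(U)$ such that (4.161) holds. We
start with a computation for any $\theta^i \in \Omega(U)$, which uses the Leibniz rule (3.65):¹⁷⁹

$d\theta^i(X,Y) = X(\theta^i(Y)) - Y(\theta^i(X)) - \theta^i([X,Y])$
$\qquad = (\tilde\nabla_X\theta^i)(Y) + \theta^i(\tilde\nabla_X Y) - (\tilde\nabla_Y\theta^i)(X) - \theta^i(\tilde\nabla_Y X) - \theta^i([X,Y])$
$\qquad = (\tilde\nabla_X\theta^i)(Y) - (\tilde\nabla_Y\theta^i)(X) + \theta^i(\tilde\nabla_X Y - \tilde\nabla_Y X - [X,Y])$
$\qquad = (\tilde\nabla_X\theta^i)(Y) - (\tilde\nabla_Y\theta^i)(X)$, (4.163)

since the Levi-Civita connection $\tilde\nabla$ is torsion-free, cf. (3.43). Eq. (4.162) then gives

$d\theta^i(\partial_\alpha,\partial_\beta) = (\tilde\nabla_\alpha\theta^i)(\partial_\beta) - (\tilde\nabla_\beta\theta^i)(\partial_\alpha) = N^i(k_{\alpha\beta} - k_{\beta\alpha}) = 0$, (4.164)

by symmetry of the extrinsic curvature k. The Poincaré lemma then gives (4.161).
It is convenient to replace the 1-forms $\theta^i$ by the corresponding vector fields $Z^i = \sharp(\theta^i)$ on U
(i = 1,...,m+1), in terms of which (4.162) becomes, writing $Z_i^\beta$ for $(Z^i)^\beta$:

$\frac{\partial Z_i^\beta}{\partial u^\alpha} + \Gamma^\beta_{\alpha\gamma} Z_i^\gamma = N^i k^\beta{}_\alpha$. (4.165)

Similarly, in terms of $Z^i$, Weingarten’s equation (4.78) becomes

$\frac{\partial N^i}{\partial u^\alpha} = -k_{\alpha\beta} Z_i^\beta$. (4.166)

We may rewrite the coupled PDEs (4.165) and (4.166) on U, i = 1,...,m+1, more elegantly as

$\tilde\nabla_X Z^i = N^i W(X)$; (4.167)
$X N^i = -k(X,Z^i)$, (4.168)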

for $X \in \mathfrak{X}(U)$ and $N^i \in C^\infty(U)$, subject to the initial conditions (4.158) - (4.159) for $N^i$, with

$Z_i^\beta(u_0) = \delta_i^\beta$ (β = 1,...,m, i = 1,...,m); (4.169)
$Z_{m+1}^\alpha(u_0) = 0$ (α = 1,...,m). (4.170)

We derived (4.167) - (4.168) with (4.169) - (4.170) from the existence of $F : U \to \mathbb{R}^{m+1}$ with
the desired properties (as stated in the theorem). Conversely, if we can solve these equations for
$Z^i$ (and $N^i$), we are able to construct F, having the right properties, via $\theta^i = \flat(Z^i)$ and (4.161).
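
To see the reconstruction at work in the simplest possible situation, consider the degenerate case m = 1 (a curve in $\mathbb{R}^2$), which is not covered by the discussion above but where the Gauss-Codazzi equations hold trivially. We assume a flat coordinate u with metric $du^2$ (so Γ = 0) and constant extrinsic curvature k = κ $du^2$, so that W = κ. Then (4.167) - (4.168) reduce to $dZ^i/du = \kappa N^i$ and $dN^i/du = -\kappa Z^i$, while (4.161) reads $dF^i/du = Z^i$. The sketch below integrates this system with a classical Runge-Kutta scheme; the function name, the value of κ, and the choice $F(u_0) = 0$ are our own assumptions for illustration.

```python
import math

def reconstruct_curve(kappa, length, steps=2000):
    """Integrate dZ/du = kappa*N, dN/du = -kappa*Z, dF/du = Z for both
    components i = 1, 2 at once, using classical fourth-order Runge-Kutta."""
    # State (F1, F2, Z1, Z2, N1, N2); initial data as in (4.158)-(4.159)
    # and (4.169)-(4.170) for m = 1, with F(u0) = 0 (our choice of x0).
    y = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]

    def rhs(s):
        F1, F2, Z1, Z2, N1, N2 = s
        return [Z1, Z2, kappa * N1, kappa * N2, -kappa * Z1, -kappa * Z2]

    h = length / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([a + 0.5 * h * b for a, b in zip(y, k1)])
        k3 = rhs([a + 0.5 * h * b for a, b in zip(y, k2)])
        k4 = rhs([a + h * b for a, b in zip(y, k3)])
        y = [a + h * (b + 2 * c + 2 * d + e) / 6
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y

kappa = 2.0  # hypothetical constant extrinsic curvature
F1, F2, Z1, Z2, N1, N2 = reconstruct_curve(kappa, length=1.0)

# The reconstructed curve should lie on the circle of radius 1/kappa
# centred at (0, 1/kappa), with tangent (Z1, Z2) orthogonal to (N1, N2).
radius = math.hypot(F1, F2 - 1.0 / kappa)
tangent_dot_normal = Z1 * N1 + Z2 * N2
```

The output is an arc of a circle of radius 1/κ, as one expects from the classical fundamental theorem of plane curves; in higher dimensions the same scheme applies along each geodesic through $u_0$.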


179 In the first line we use the identity $d\theta(X,Y) = X(\theta(Y)) - Y(\theta(X)) - \theta([X,Y])$, valid for any $\theta \in \Omega(U)$.
We now show that this can be done. To begin with, we show that the integrability conditions
for (4.167) - (4.168) are the Gauss-Codazzi equations, which should come as no surprise, since
(4.167) - (4.168) are a version of the Gauss-Weingarten equations. From (4.168) we derive

$[X,Y]N^i = -Xk(Y,Z^i) + Yk(X,Z^i)$; (4.171)
$[X,Y]N^i = -k([X,Y],Z^i)$, (4.172)

so that $Xk(Y,Z^i) - Yk(X,Z^i) = k([X,Y],Z^i)$; a computation very similar to (4.163) then rewrites
this as Codazzi’s eq. (4.155). Similarly, practically the same computation as (4.150) - (4.153),
using (4.155), shows that (4.167) implies Gauss’s eq. (4.154). Thus the Gauss-Codazzi equations
are necessary for the solvability of (4.167) - (4.168), which explains their role in Theorem 4.18.

To show that they are also sufficient, we have to get our hands dirty (as usual in PDE theory).
We take geodesic normal coordinates $(u^\alpha)$ relative to $u_0 \in U$ (it may be necessary to shrink U
in order to make it a normal nbhd) and some fixed orthonormal basis $(e_1,\ldots,e_m)$ of $T_{u_0}\tilde M$, so
that the coordinates $(u^1,\ldots,u^m)$ specify the point $u = \gamma_u(1)$, where $\gamma_u$ is the (unique) geodesic
having $\gamma_u(0) = u_0$ and $\dot\gamma_u(0) = u^\alpha e_\alpha$ (summation convention!).

For fixed $u \in U$, define a vector field $Z^i$ and functions $N^i$ along this geodesic $\gamma_u$ by solving

$\tilde\nabla_{\dot\gamma_u} Z^i = N^i W(\dot\gamma_u)$; (4.173)
$\dot\gamma_u N^i = -k(\dot\gamma_u, Z^i)$, (4.174)

at least for $t \in [0,1]$, or, in coordinates, where $Z^i = (Z_i^1,\ldots,Z_i^m)$ as above, and $tu = \gamma_u(t)$,

$\frac{dZ_i^\beta(t)}{dt} + u^\alpha \Gamma^\beta_{\alpha\gamma}(tu)\, Z_i^\gamma(t) = N^i(t)\, k^\beta{}_\alpha(tu)\, u^\alpha$; (4.175)
$\frac{dN^i(t)}{dt} = -k_{\alpha\beta}(tu)\, u^\alpha Z_i^\beta(t)$, (4.176)

with initial conditions $Z_i^\alpha(0) = \delta_i^\alpha$ (i ≤ m), $Z_{m+1}^\alpha(0) = 0$, $N^i(0) = 0$ (i ≤ m), and $N^{m+1}(0) = 1$,
cf. (4.169) - (4.170) and (4.158) - (4.159). Here we identified $Z^i(t)$ with $Z^i(tu)$, etc. By standard
ODE theory, $Z^i(t)$ and $N^i(t)$ exist and are unique. Finally, define $Z^i \in \mathfrak{X}(U)$ and $N^i \in C^\infty(U)$ by

$Z^i(u) = Z^i(1)$; (4.177)
$N^i(u) = N^i(1)$, (4.178)

where the $Z^i$ and $N^i$ on the right-hand side depend on u by construction. We claim that this pair
$(Z^i, N^i)$ solves (4.167) - (4.168) with the right initial conditions (4.169) - (4.170) and (4.158) -
(4.159). To prove this, it is convenient to introduce two constant vector fields on U by

$X = \partial_\alpha$ (α = 1,...,m); (4.179)
$Y = a^\alpha \partial_\alpha$, (4.180)

where $(a^1,\ldots,a^m)$ are the normal coordinates of some fixed $a \in U$. The equations

$\tilde\nabla_Y Z^i = N^i W(Y)$; (4.181)
$Y N^i = -k(Y,Z^i)$, (4.182)

then hold along the geodesic $\gamma_a(t)$ for $t \in [0,1]$, since there they coincide with (4.173) - (4.174).




We claim that along $\gamma_a(t)$ the functions $(Z^i, N^i)$ defined by (4.177) - (4.178) also satisfy

$\tilde\nabla_Y(\tilde\nabla_X Z^i - N^i W(X)) = (XN^i + k(X,Z^i))W(Y)$; (4.183)
$Y(XN^i + k(X,Z^i)) = -k(Y, \tilde\nabla_X Z^i - N^i W(X))$, (4.184)

which equations are none other than (4.181) - (4.182), with the substitutions

$Z^i \rightsquigarrow \tilde\nabla_X Z^i - N^i W(X)$; (4.185)
$N^i \rightsquigarrow XN^i + k(X,Z^i)$. (4.186)

Note that the initial conditions for (4.183) - (4.184) follow from those for (4.181) - (4.182), viz.

$\tilde\nabla_X Z^i(u_0) - N^i(u_0) W_{u_0}(X) = 0$; (4.187)
$XN^i(u_0) + k_{u_0}(X,Z^i) = 0$. (4.188)

Indeed, by the construction of geodesic normal coordinates, at the point $u_0$, the pair $(Z^i,N^i)$
satisfies (4.181) - (4.182) for any Y, and so in particular for X. The point now is that, (4.183) -
(4.184) being a first-order linear system, its unique solution with initial conditions zero is zero, which
by (4.185) - (4.186) shows that $(Z^i, N^i)$ solves (4.167) - (4.168), with given initial conditions.


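It remains to derive (4.183) - (4.184) from (4.181) - (4.182) and the Gauss-Codazzi equations.
The argument should be familiar by now, but here we go! To derive (4.183), we compute

$\tilde\nabla_Y(\tilde\nabla_X Z^i - N^i W(X)) = \tilde\nabla_Y\tilde\nabla_X Z^i - (YN^i)W(X) - N^i\tilde\nabla_Y(W(X))$
$\qquad = \tilde\nabla_X\tilde\nabla_Y Z^i - \tilde\Omega(X,Y)Z^i - (YN^i)W(X) - N^i((\tilde\nabla_Y W)(X) + W(\tilde\nabla_Y X))$
$\qquad = \tilde\nabla_X(N^i W(Y)) + k(X,Z^i)W(Y) - k(Y,Z^i)W(X)$
$\qquad\quad - (YN^i)W(X) - N^i((\tilde\nabla_Y W)(X) + W(\tilde\nabla_Y X))$
$\qquad = (XN^i + k(X,Z^i))W(Y) + N^i(\tilde\nabla_X(W(Y)) - (\tilde\nabla_Y W)(X) - W(\tilde\nabla_Y X))$
$\qquad = (XN^i + k(X,Z^i))W(Y)$, (4.189)

where we use (4.153) to pass to the second line and use (4.182) to cancel the term $k(Y,Z^i)W(X)$
on the previous line. Finally, the coefficient of $N^i$ in the penultimate line is zero by Codazzi’s
equation (4.155), which emerges after using (3.65) to write $\tilde\nabla_X(W(Y)) = (\tilde\nabla_X W)(Y) + W(\tilde\nabla_X Y)$,
and noting that $W(\tilde\nabla_X Y) - W(\tilde\nabla_Y X) = W(\tilde\nabla_X Y - \tilde\nabla_Y X) = 0$ because $\tilde\nabla_X Y = \tilde\nabla_Y X$, since $\tilde\nabla$ is
torsion-free and [X,Y] = 0 for the constant vector fields (4.179) - (4.180). Similarly, to derive
(4.184), using eqs. (4.182), (3.65), Codazzi’s (4.155), and (4.181), we compute

$Y(XN^i + k(X,Z^i)) = XYN^i + Yk(X,Z^i) = -Xk(Y,Z^i) + Yk(X,Z^i)$
$\qquad = (\tilde\nabla_Y k)(X,Z^i) - (\tilde\nabla_X k)(Y,Z^i) + k(\tilde\nabla_Y X,Z^i) - k(\tilde\nabla_X Y,Z^i)$
$\qquad\quad - k(Y,\tilde\nabla_X Z^i) + k(X,\tilde\nabla_Y Z^i)$
$\qquad = -k(Y,\tilde\nabla_X Z^i) + k(X,N^i W(Y))$
$\qquad = -k(Y,\tilde\nabla_X Z^i - N^i W(X))$, (4.190)

since k(X,W(Y)) = k(Y,W(X)), which in coordinates is the identity $k_{\alpha\gamma}g^{\gamma\delta}k_{\delta\beta} = k_{\beta\gamma}g^{\gamma\delta}k_{\delta\alpha}$.
This proves (4.184) and completes the local proof of Theorem 4.18.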
5 Geodesics and causal structure


In this chapter we introduce the causal theory of space-times, culminating in the key notion of 
global hyperbolicity. This theory also allows us to study local and global length-extremizing 
properties of geodesics, which are needed for the singularity theorems in the next chapter. The 


link with curvature as studied in the previous chapter is provided by the topic of the next section. 


5.1 Geodesic deviation and Jacobi fields 


In this section we give an interpretation of curvature through geodesic deviation. This applies to
both Riemannian and Lorentzian metrics and for the latter is a physical phenomenon, even a key
prediction of GR. Let $U \in \mathcal{O}(\mathbb{R}^2)$ be connected and let $\gamma: U \to M$ be a family of curves: with
$(s,t) \in U$ we write $\gamma_s(t) = \gamma(s,t)$, regarding $t$ as the 'time' parameter on each curve $\gamma_s$, and $s$ as
a parameter labeling the curves. Apart from the vector field tangent to $\gamma_s(t)$ along the $t$-flow,

$$\dot\gamma_s = \gamma_*(\partial/\partial t) = \frac{\partial\gamma_s}{\partial t}, \tag{5.1}$$

on $\gamma(U)$, which gives the tangent vectors to each $\gamma_s$ for fixed $s$ as $t$ "runs", we now also have a
second vector field tangent to $\gamma_s(t)$ along the $s$-flow, i.e.,

$$\gamma'_s = \gamma_*(\partial/\partial s) = \frac{\partial\gamma_s}{\partial s}. \tag{5.2}$$
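Let $\nabla$ be the Levi-Civita connection on $TM$. For any vector field $Z$ defined on $\gamma(U)$, abbreviate
$$\nabla_t Z = \nabla_{\dot\gamma_s} Z; \qquad \nabla_s Z = \nabla_{\gamma'_s} Z. \tag{5.3}$$
Since $[\partial/\partial s, \partial/\partial t] = 0$ on $U \subset \mathbb{R}^2$ by standard calculus, on $\gamma(U)$ we have, cf. (4.12),
$$[\dot\gamma_s, \gamma'_s] = 0. \tag{5.4}$$
Therefore, because $\nabla$ is torsion-free we have the important identity
$$\nabla_t \gamma'_s = \nabla_s \dot\gamma_s. \tag{5.5}$$

Another application of (5.4), with (4.10), is that for any $Z \in \mathfrak{X}(\gamma(U))$ we have
$$[\nabla_t, \nabla_s] Z = \Omega(\dot\gamma_s, \gamma'_s) Z. \tag{5.6}$$

Now assume that each curve $t \mapsto \gamma_s(t)$ is a geodesic, so that $\nabla_t \dot\gamma_s = 0$, and take $Z = \dot\gamma_s$. Using
also (5.5), eq. (5.6) becomes the Jacobi equation or equation of geodesic deviation

$$\nabla_t^2 \gamma'_s = \Omega(\dot\gamma_s, \gamma'_s)\dot\gamma_s \quad\Leftrightarrow\quad \nabla_t^2\left(\frac{\partial\gamma_s^\rho}{\partial s}\right) = R^\rho{}_{\sigma\mu\nu}\,\frac{\partial\gamma_s^\sigma}{\partial t}\frac{\partial\gamma_s^\mu}{\partial t}\frac{\partial\gamma_s^\nu}{\partial s}. \tag{5.7}$$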


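We now change perspective and start from a single geodesic $\gamma$. We then define a Jacobi field
along $\gamma$ as any vector field $J$, defined along $\gamma$, that satisfies Jacobi's equation

$$\nabla_t^2 J = \Omega(\dot\gamma, J)\dot\gamma; \tag{5.8}$$
$$\nabla_t^2 J^\rho = R^\rho{}_{\sigma\mu\nu}\,\frac{d\gamma^\sigma}{dt}\frac{d\gamma^\mu}{dt}J^\nu. \tag{5.9}$$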

Clearly, any one-parameter family of geodesics produces a Jacobi field along any fixed one of
them by the above procedure. Conversely, Jacobi fields give rise to such a family:

Proposition 5.1 Any solution $J$ of (5.8) or (5.9) along a geodesic $\gamma$ enables one to extend $\gamma$ to a
one-parameter family $(\gamma_s)$ of geodesics for which $\gamma = \gamma_0$ and

$$J = \gamma'_0. \tag{5.10}$$
This will be proved in the next subsection, since we need the exponential map for the proof.


Proposition 5.2 The collection of Jacobi fields along a given geodesic $\gamma: [a,b] \to M$ forms a
vector space $\mathcal{J}_\gamma$ of dimension $2\dim(M)$. Specifically, one has a linear isomorphism

$$\mathcal{J}_\gamma \cong T_{\gamma(a)}M \oplus T_{\gamma(a)}M; \tag{5.11}$$
$$J \mapsto (J(a), \nabla_t J(a)). \tag{5.12}$$

Moreover, if $J(a)$ and $\nabla_t J(a)$ are both orthogonal (parallel) to $\dot\gamma(a)$, then $J(t)$ and $\nabla_t J(t)$ remain
orthogonal (parallel) to $\dot\gamma(t)$ for all $t \in [a,b]$. In the parallel case, one simply has

$$J(t) = (c_1 + (t-a)c_2)\,\dot\gamma(t), \tag{5.13}$$
for given initial conditions $J(a) = c_1\dot\gamma(a)$ and $\nabla_t J(a) = c_2\dot\gamma(a)$, independent of Riem.¹⁸⁰

Proof. Eq. (5.8) or (5.9) is a linear second-order ODE for $J$, which may be rewritten as a linear
first-order system $K^\rho(t) = \nabla_t J^\rho(t)$ and $\nabla_t K^\rho(t) = A^\rho_\sigma(t)J^\sigma(t)$. For such systems solutions not
merely exist for short times, but for all $t$ for which the matrix $A^\rho_\sigma(t)$ is defined. The proof of the
other claims is almost trivial and is left to the reader.
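
As a numerical illustration (not part of the original argument), consider a unit-speed geodesic in a surface of constant sectional curvature $K$; this restriction is an assumption made only for the example. Writing the normal part of a Jacobi field as $J(t) = f(t)E(t)$, with $E$ a parallel unit field orthogonal to the geodesic, the Jacobi equation (5.8) reduces to the scalar equation $f'' = -Kf$. The sketch below integrates it with a classical Runge-Kutta step for $f(0) = 0$, $f'(0) = 1$; the function name and the step count are our own choices.

```python
import math

def jacobi_normal_component(K, t_end, steps=2000):
    """Integrate f'' = -K f with f(0) = 0, f'(0) = 1 by RK4 and return f(t_end).

    Assumption for this sketch: constant sectional curvature K and a
    unit-speed geodesic, so that the Jacobi equation becomes this scalar ODE.
    """
    h = t_end / steps
    f, p = 0.0, 1.0  # f = normal component of J, p = its t-derivative
    for _ in range(steps):
        k1f, k1p = p, -K * f
        k2f, k2p = p + 0.5 * h * k1p, -K * (f + 0.5 * h * k1f)
        k3f, k3p = p + 0.5 * h * k2p, -K * (f + 0.5 * h * k2f)
        k4f, k4p = p + h * k3p, -K * (f + h * k3f)
        f += h * (k1f + 2 * k2f + 2 * k3f + k4f) / 6
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6
    return f
```

On the unit sphere ($K = 1$) the solution is $f(t) = \sin t$, which returns to zero at $t = \pi$, whereas in flat space ($K = 0$) one finds $f(t) = t$ and for $K = -1$ the solution $\sinh t$ grows without ever vanishing again.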


Jacobi fields play an important role in the variational properties of geodesics. To explain this we
compute the second variation of the length functional (3.16) in the Riemannian case and insert
the appropriate sign(s) for the Lorentzian case at the end. First, we recompute the first variation,
using the powerful notion of the covariant derivative that was not yet available in §3.2. Note that,
in contrast to our discussion of Jacobi fields, here we neither assume that each $\gamma_s$ is a geodesic,
nor (for later use in computing the second derivative) that it is parametrized by arc length (i.e.
has constant speed). Using (3.65) and (3.52), (5.3), and (5.5), we obtain

$$\frac{dL(\gamma_s)}{ds} = \int_a^b dt\, \frac{d}{ds}\sqrt{g_{\gamma_s(t)}(\dot\gamma_s(t),\dot\gamma_s(t))}
= \int_a^b dt\, \frac{g_{\gamma_s(t)}(\nabla_s\dot\gamma_s(t),\dot\gamma_s(t))}{\sqrt{g_{\gamma_s(t)}(\dot\gamma_s(t),\dot\gamma_s(t))}}
= \int_a^b dt\, \frac{g_{\gamma_s(t)}(\nabla_t\gamma'_s(t),\dot\gamma_s(t))}{\sqrt{g_{\gamma_s(t)}(\dot\gamma_s(t),\dot\gamma_s(t))}}. \tag{5.14}$$


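If we now do put $s = 0$ (with $\gamma_0 = \gamma$) and do assume constant speed, say $\|\dot\gamma(t)\| = v$, we continue:

$$\begin{aligned}
\frac{dL(\gamma_s)}{ds}(s=0) &= \frac{1}{v}\int_a^b dt\, g_{\gamma(t)}(\nabla_t\gamma'(t),\dot\gamma(t))
= \frac{1}{v}\int_a^b dt \left[\frac{d}{dt}\big(g_{\gamma(t)}(\gamma'(t),\dot\gamma(t))\big) - g_{\gamma(t)}(\gamma'(t),\nabla_t\dot\gamma(t))\right] \\
&= \frac{1}{v}\left(\big[g_\gamma(\gamma',\dot\gamma)\big]_a^b - \int_a^b dt\, g_\gamma(\gamma',\nabla_{\dot\gamma}\dot\gamma)\right),
\end{aligned} \tag{5.15}$$
since $\nabla_t = \nabla_{\dot\gamma}$. For fixed-endpoint variations, where $\gamma'(a) = \gamma'(b) = 0$, we therefore obtain
$$L'(\gamma) := \frac{dL(\gamma_s)}{ds}(s=0) = -\frac{1}{v}\int_a^b dt\, g_\gamma(\gamma',\nabla_{\dot\gamma}\dot\gamma), \tag{5.16}$$
since the boundary term in (5.15) vanishes. Thus we see that the extremality condition $L'(\gamma) = 0$
enforces the geodesic equation (3.48), since $\gamma'$ in (5.16) is arbitrary and $g$ is nondegenerate.

¹⁸⁰ This proposition is true in the Riemannian case, and for non-null geodesics in the Lorentzian case (cf. §5.3).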
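We now compute the second derivative of $L(\gamma)$ from (5.14):

$$\begin{aligned}
\frac{d^2L(\gamma_s)}{ds^2}(s=0) &= \int_a^b dt\, \frac{d}{ds}\left(\frac{g_{\gamma_s(t)}(\nabla_t\gamma'_s(t),\dot\gamma_s(t))}{\sqrt{g_{\gamma_s(t)}(\dot\gamma_s(t),\dot\gamma_s(t))}}\right)(s=0) \\
&= \frac{1}{v}\int_a^b dt\, \big[g_{\gamma_s(t)}(\nabla_s\nabla_t\gamma'_s(t),\dot\gamma_s(t)) + g_{\gamma_s(t)}(\nabla_t\gamma'_s(t),\nabla_s\dot\gamma_s(t))\big](s=0) \\
&\quad - \frac{1}{v^3}\int_a^b dt\, \big(g_{\gamma_s(t)}(\nabla_t\gamma'_s(t),\dot\gamma_s(t))\big)^2(s=0),
\end{aligned} \tag{5.17}$$
where we used (5.5) to obtain the last term. Rewriting the first term using (5.6) gives
$$\begin{aligned}
g_\gamma(\nabla_s\nabla_t\gamma'_s,\dot\gamma_s)_{|s=0} &= g_\gamma([\nabla_s,\nabla_t]\gamma',\dot\gamma) + g_\gamma(\nabla_t\nabla_s\gamma',\dot\gamma) \\
&= g_\gamma([\nabla_s,\nabla_t]\gamma',\dot\gamma) - g_\gamma(\nabla_s\gamma',\nabla_t\dot\gamma) + \frac{d}{dt}\,g_\gamma(\nabla_s\gamma',\dot\gamma).
\end{aligned} \tag{5.18}$$

In the last line, the first term equals $-R_\gamma(\dot\gamma,\gamma',\dot\gamma,\gamma')$, the second term vanishes for geodesics, and
for fixed-endpoint variations the third term also vanishes upon integration $\int_a^b dt$. Furthermore,
we use (5.5), so that $g_\gamma(\nabla_t\gamma'_s,\nabla_s\dot\gamma_s) = g_\gamma(\nabla_t\gamma'_s,\nabla_t\gamma'_s)$. Introducing the component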


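$$\gamma'_\perp = \gamma' - v^{-2}g_\gamma(\gamma',\dot\gamma)\,\dot\gamma \tag{5.19}$$

of $\gamma'$ that is perpendicular to $\dot\gamma$, we have, omitting terms containing $\nabla_t\dot\gamma = \nabla_{\dot\gamma}\dot\gamma = 0$,

$$g_\gamma(\nabla_t\gamma',\nabla_t\gamma') - \frac{1}{v^2}\big(g_\gamma(\nabla_t\gamma',\dot\gamma)\big)^2 = g_\gamma(\nabla_t\gamma'_\perp,\nabla_t\gamma'_\perp). \tag{5.20}$$

Up to a boundary term vanishing upon integration for fixed-endpoint variations, we may replace
the right-hand side by $-g_\gamma(\gamma'_\perp,\nabla_t^2\gamma'_\perp)$. By the symmetries of the Riemann tensor, we have

$$R_\gamma(\dot\gamma,\gamma',\dot\gamma,\gamma') = R_\gamma(\dot\gamma,\gamma'_\perp,\dot\gamma,\gamma'_\perp),$$

so that we finally obtain Synge's formula for the second variational derivative of $L(\gamma)$:¹⁸¹

$$\begin{aligned}
L''(\gamma) &= -\frac{1}{v}\int_a^b dt\, g_\gamma\big(\gamma'_\perp,\ \nabla_t^2\gamma'_\perp - \Omega(\dot\gamma,\gamma'_\perp)\dot\gamma\big) && (5.21) \\
&= \frac{1}{v}\int_a^b dt\, \big[g_\gamma(\nabla_t\gamma'_\perp,\nabla_t\gamma'_\perp) - R_\gamma(\dot\gamma,\gamma'_\perp,\dot\gamma,\gamma'_\perp)\big]. && (5.22)
\end{aligned}$$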


Note that we did not assume that the curves $\gamma_s$ were geodesics, except $\gamma_0 = \gamma$. In the Lorentzian
case, for timelike curves,¹⁸² one obtains minus the right-hand sides in (5.21) - (5.22); this
sign goes back to the one in (5.114); we invite the reader to redo the calculation for this case.
As in calculus, $L(\gamma)$ is a local minimum iff $L''(\gamma) > 0$, whereas it is a local maximum iff
$L''(\gamma) < 0$. We will see shortly that in the Riemannian case for small $t$ one starts out with
$L''(\gamma) > 0$, so that at least for short times geodesics are locally minimizing. It is clear from (5.22)
that in case of negative sectional curvature this will always remain the case, but in general, $L''(\gamma)$
may go through zero. According to (5.21) and (5.8), one has $L''(\gamma) = 0$ precisely when $\gamma'_\perp$ is a
Jacobi field. After a necessary technical intermezzo, in §5.4 we analyze what this means.
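
The sign change of the second variation can be made concrete in a simple case. Assume, purely for illustration, constant sectional curvature $K$, unit speed $v = 1$, and a variation field $\gamma'_\perp = f(t)E(t)$ with $E$ a parallel unit normal field and $f(t) = \sin(\pi t/b)$ on $[0,b]$. Under these assumptions the right-hand side of (5.22) becomes $\int_0^b (f'(t)^2 - K f(t)^2)\,dt$, which the sketch below evaluates by the midpoint rule; the function name is hypothetical.

```python
import math

def second_variation(K, b, n=20000):
    """Midpoint-rule value of the integral of f'(t)^2 - K f(t)^2 over [0, b],
    for the test field f(t) = sin(pi t / b) vanishing at both endpoints.

    Assumptions of this sketch: constant curvature K, unit-speed geodesic.
    """
    h = b / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        f = math.sin(math.pi * t / b)
        df = (math.pi / b) * math.cos(math.pi * t / b)
        total += (df * df - K * f * f) * h
    return total
```

The exact value is $(b/2)((\pi/b)^2 - K)$. On the unit sphere it is positive for $b < \pi$, zero at $b = \pi$ and negative beyond, while for $K \le 0$ it stays positive for every $b$, in line with the remarks above.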


¹⁸¹ See Synge (1960), §I.6, eq. (136). It is quite remarkable that not just in the first variation (5.16), where it is
expected, but also in the second variation (5.21), only the first $s$-derivative of the family $\gamma_s$ appears.
¹⁸² A curve $t \mapsto c(t)$ is timelike if $g_{c(t)}(\dot c(t),\dot c(t)) < 0$ for all $t$ where $c(t)$ is defined. See §5.3 below.

5.2 The exponential map


Given some metric, fix $x \in M$ and define $\mathcal{V}_x \subset T_xM$ as the set of vectors $X \in T_xM$ for which the
geodesic $t \mapsto \gamma_X^{(x)}(t)$ emanating at $x$ with initial velocity $X$, i.e.,

$$\gamma_X^{(x)}(0) = x; \qquad \dot\gamma_X^{(x)}(0) = X, \tag{5.23}$$

is defined at least for $0 \le t \le 1$. If $(M,g)$ is complete, then $\mathcal{V}_x = T_xM$ for all $x$. Note that each
$\mathcal{V}_x$ is automatically open and star-shaped in that $X \in \mathcal{V}_x$ implies $tX \in \mathcal{V}_x$ for all $t \in [0,1]$. This
follows because for any $t$ for which $\gamma_X^{(x)}(t)$ is defined (for given $X$), and any $\rho > 0$, one has

$$\gamma_{\rho X}^{(x)}(t) = \gamma_X^{(x)}(\rho t). \tag{5.24}$$

Indeed, the left-hand side solves (3.24) with the same initial condition as the right-hand side.
The exponential map $\exp_x : \mathcal{V}_x \to M$ (based at $x$) is then defined by

$$\exp_x(X) = \gamma_X^{(x)}(1). \tag{5.25}$$

This map underlies many proofs and hence we take the reader through its main features.¹⁸³
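
For a concrete feel, the round unit sphere $S^2 \subset \mathbb{R}^3$ (an example we choose here; it is not singled out in the text) has geodesics that are great circles, so that $\gamma_X^{(x)}(t) = \cos(\|X\|t)\,x + \sin(\|X\|t)\,X/\|X\|$ for a tangent vector $X \perp x$. The following sketch, with function names of our own, implements this formula and the exponential map (5.25), so that the rescaling property (5.24) can be checked numerically.

```python
import math

def sphere_geodesic(x, X, t):
    """Point at parameter t on the unit-sphere geodesic starting at x with
    initial velocity X (X is assumed tangent at x, i.e. orthogonal to x)."""
    speed = math.sqrt(sum(c * c for c in X))
    if speed == 0.0:
        return tuple(x)  # the constant geodesic
    a = speed * t
    return tuple(math.cos(a) * xi + math.sin(a) * Xi / speed
                 for xi, Xi in zip(x, X))

def exp_map(x, X):
    """Exponential map at x: follow the geodesic with velocity X up to t = 1."""
    return sphere_geodesic(x, X, 1.0)
```

For instance, with the north pole `x = (0, 0, 1)`, the vector `(math.pi, 0, 0)` is mapped to the south pole, and every tangent vector of length $\pi$ has the same image: on the sphere $\exp_x$ is defined on all of $T_xM$ but is a diffeomorphism only on the open disc of radius $\pi$.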


1. At each $x$ the set $\mathcal{V}_x \subset T_xM$ can be shrunk to an open star-shaped subset $\mathcal{U}_x \subset \mathcal{V}_x$ on which
$$\exp_x : \mathcal{U}_x \to M$$
is a diffeomorphism onto its image $U_x$, called a normal neighbourhood of $x$. This follows
from the inverse function theorem plus the observation that the derivative (pushforward)
of $\exp_x$ at $0 \in T_xM$ is the identity map (as is easily verified). Moreover, $U_x$ may be shrunk
to a convex neighbourhood $W_x$ of $x$: this means that $W_x$ is a normal nbhd of any of its
points, so that any two points of $W_x$ may be connected by a unique geodesic.

Eq. (5.24) then implies that $\exp_x$ maps each line segment $\{tX \mid 0 \le t \le 1\}$ in $T_xM$ to the
geodesic segment $\{\gamma_X^{(x)}(t) \mid 0 \le t \le 1\}$ in $M$. Conversely, geodesics within $U_x$ emanating
from $x$ are flattened by $\exp_x^{-1}$. This is because any point $y = \exp_x(X) \in U_x$ is connected
to $x$ by a unique geodesic within $U_x$, viz. $\gamma_X^{(x)}$, where $X = \exp_x^{-1}(y)$ (there may be other
geodesics from $x$ to $y$, but if so, these leave $U_x$). To see this, consider some geodesic
$$c: [0,1] \to M; \qquad c(0) = x; \qquad c(1) = y, \tag{5.26}$$
and take $Y = \dot c(0)$. Uniqueness of geodesics $c$ with given initial data $c(0)$ and $\dot c(0)$ yields
$$c(t) = \gamma_Y^{(x)}(t). \tag{5.27}$$
Then $c([0,1]) \subset U_x$ implies $Y \in \mathcal{U}_x$, and the endpoint matching condition
$$\gamma_Y^{(x)}(1) = y = \gamma_X^{(x)}(1) \tag{5.28}$$
enforces $Y = X$, which implies $c = \gamma_X^{(x)}$.


¹⁸³ O'Neill (1983), Senovilla (1998), and Minguzzi (2019) are good references for this material.

2. Jacobi fields give the pushforward of the exponential map. For each $X \in \mathcal{V}_x$ we have
$$(\exp_x)'_X : T_X(T_xM) \to T_{\exp_x(X)}M. \tag{5.29}$$
Identifying $T_X(T_xM) \cong T_xM$, which is done through the identification

$$Z \in T_xM \ \leftrightarrow\ \frac{d}{dt}(X + tZ)_{|t=0} \in T_X(T_xM), \tag{5.30}$$

(5.29) becomes a linear map $(\exp_x)'_X : T_xM \to T_{\exp_x(X)}M$. Take $Z \in T_xM$ (not necessarily
orthogonal to $X = \dot\gamma_X^{(x)}(0)$) and let $J_Z(t)$ be the Jacobi field along $\gamma_X^{(x)}$ with boundary
conditions $J_Z(0) = 0$ and $\nabla_t J_Z(0) = Z$. For $t \in [0,1]$ we have $J_Z(t) = (\exp_x)'_{tX}(tZ)$, so

$$(\exp_x)'_X(Z) = J_Z(1). \tag{5.31}$$


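3. The exponential map leads to the idea of (geodesic) normal coordinates (GNC) relative to
both some point $x_0 \in M$ and a choice of an orthonormal basis $(e_\mu)$ of $T_{x_0}M$. That is,

$$g_{x_0}(e_\mu,e_\nu) = \delta_{\mu\nu} \quad \text{(Riemannian case)}; \tag{5.32}$$
$$g_{x_0}(e_\mu,e_\nu) = \eta_{\mu\nu} \quad \text{(Lorentzian case)}. \tag{5.33}$$

These coordinates are defined on the chart $U_{x_0}$, as follows: the normal coordinates of
$x \in U_{x_0}$ are the coordinates of $\exp_{x_0}^{-1}(x) \in T_{x_0}M$ with respect to the given basis of $T_{x_0}M$.

In other words, if $X = x^\mu e_\mu$, then $x = \exp_{x_0}(X)$ has GNC $x^\mu$, or, equivalently:
The normal coordinates $(x^\mu)$ label the point $\exp_{x_0}(x^\mu e_\mu) = \gamma^{(x_0)}_{x^\mu e_\mu}(1)$.

In particular, $x_0$ has GNC $x^\mu = 0$. By definition of $\exp_{x_0}$, and using (5.24) we also have, for any smooth function $f$ near $x_0$,

$$\partial_\mu f(x_0) = \frac{d}{dt} f(\exp_{x_0}(te_\mu))_{|t=0} = \frac{d}{dt} f\big(\gamma^{(x_0)}_{e_\mu}(t)\big)_{|t=0} = e_\mu f, \tag{5.34}$$

so that in $T_{x_0}M$ we have $\partial_\mu = e_\mu$ and hence, say for the Lorentzian case, by (5.33),
$$g_{\mu\nu}(0) = g_{x_0}(e_\mu,e_\nu) = \eta_{\mu\nu}. \tag{5.35}$$
In GNC, by (5.24) the coordinates of the curve $t \mapsto \gamma^{(x_0)}_{x^\mu e_\mu}(t)$ are
$$t x^\mu, \tag{5.36}$$
which is clearly a geodesic. The geodesic equation (3.24) then gives
$$\Gamma^\rho_{\mu\nu}(tx)\,x^\mu x^\nu = 0. \tag{5.37}$$
For $t = 0$ and $t = 1$ this gives, respectively,

$$\Gamma^\rho_{\mu\nu}(0) = 0; \tag{5.38}$$
$$\Gamma^\rho_{\mu\nu}(x)\,x^\mu x^\nu = 0. \tag{5.39}$$
From (4.15), the first one gives
$$\partial_\nu g_{\rho\mu} + \partial_\mu g_{\rho\nu} - \partial_\rho g_{\mu\nu} = 0. \tag{5.40}$$
Cyclic permutation of indices gives
$$\partial_\nu g_{\rho\mu} - \partial_\mu g_{\rho\nu} + \partial_\rho g_{\mu\nu} = 0. \tag{5.41}$$
Adding these equations sharpens (5.38) to
$$\partial_\rho g_{\mu\nu}(0) = 0. \tag{5.42}$$
But this is where the buck stops: for the second derivative a calculation shows that
$$\partial_\rho\partial_\sigma g_{\mu\nu}(0) = -\tfrac{1}{3}\big(R_{\mu\sigma\nu\rho}(0) + R_{\mu\rho\nu\sigma}(0)\big), \tag{5.43}$$
which is equivalent to (4.39) and, in GNC, to
$$g_{\mu\nu}(x) = \eta_{\mu\nu} - \tfrac{1}{3}R_{\mu\rho\nu\sigma}x^\rho x^\sigma + O(x^3). \tag{5.44}$$

Finally, since $f(t) = g_{\gamma_X^{(x)}(t)}\big(\dot\gamma_X^{(x)}(t),\dot\gamma_X^{(x)}(t)\big)$ is constant along $\gamma_X^{(x)}$, i.e. in $t$, in GNC

$$g_{\mu\nu}(x)\,x^\mu x^\nu = g_{\mu\nu}(0)\,x^\mu x^\nu = \eta_{\mu\nu}x^\mu x^\nu, \tag{5.45}$$

as the left-hand side is $f(1)$ whilst the right-hand side is $f(0)$. This will be used shortly.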


4. Gauss's Lemma (which will be used in §5.4) sharpens (5.45) to

$$g_{\mu\nu}(x)\,x^\nu = g_{\mu\nu}(0)\,x^\nu, \tag{5.46}$$

or, in coordinate-free form, for arbitrary $X \in \mathcal{V}_x$ and $Z \in T_xM$,

$$g_{\exp_x(X)}\big((\exp_x)'_X(X),\ (\exp_x)'_X(Z)\big) = g_x(X,Z). \tag{5.47}$$

This states that although the presence of the curvature in the right-hand side of (5.8)
prevents the exponential map from being an isometry (which it is in flat space), the radial
component of any vector along a geodesic preserves its length under $\exp_x$. To see that
(5.47) is equivalent to (5.46), note that according to (5.36), in GNC we have

$$\big((\exp_x)'_X(X)\big)^\mu = X^\mu, \tag{5.48}$$
so if we use (5.30) with $t \rightsquigarrow s$, by definition of the pushforward $(\exp_x)'$ we obtain
$$(\exp_x)'_X(Z) = \frac{d}{ds}\big(\exp_x(X + sZ)\big)_{|s=0}; \tag{5.49}$$

which in GNC gives
$$\big((\exp_x)'_X(Z)\big)^\mu = Z^\mu. \tag{5.50}$$

Hence the left-hand side of (5.47) is $g_{\mu\nu}(x)X^\mu Z^\nu$, and since the right-hand side is obviously
$g_{\mu\nu}(0)X^\mu Z^\nu$, we have proven the said equivalence.




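To prove (5.46) and hence (5.47),¹⁸⁴ we note that (5.39) with (4.15) implies
$$(2g_{\mu\rho,\nu} - g_{\mu\nu,\rho})\,x^\mu x^\nu = 0. \tag{5.51}$$
Furthermore, taking (5.45) at arbitrary $t$, we have
$$g_{\mu\nu}(tx)\,x^\mu x^\nu = g_{\mu\nu}(0)\,x^\mu x^\nu, \tag{5.52}$$
whence, by taking the derivative $\partial_\rho$ of both sides,
$$t\,g_{\mu\nu,\rho}(tx)\,x^\mu x^\nu + 2g_{\mu\rho}(tx)\,x^\mu = 2g_{\mu\rho}(0)\,x^\mu. \tag{5.53}$$

Combining (5.51), evaluated at $tx$, and (5.53) yields

$$\frac{d}{dt}\Big(t\,\big(g_{\mu\rho}(tx) - g_{\mu\rho}(0)\big)\,x^\mu\Big) = 0. \tag{5.54}$$

The expression between brackets is therefore constant in $t$, and it vanishes at $t = 0$.
Hence we may evaluate the expression between brackets at $t = 1$, which gives (5.46).

Combining (5.24), (5.31), and (5.47) gives, along the geodesic $\gamma_X^{(x)}$ (at least for $t \in [0,1]$)

$$g_{\gamma_X^{(x)}(t)}\big(\dot\gamma_X^{(x)}(t),\ J_Z(t)\big) = t\,g_x(X,Z). \tag{5.55}$$

For example, on $M = \mathbb{R}^n$ with Euclidean metric (i.e. $g_{ij} = \delta_{ij}$) one simply has

$$J_Z(t) = tZ. \tag{5.56}$$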


5. We now prove Proposition 5.1. Given $\gamma(t)$ and $J(t)$, let $c(s)$ be the unique geodesic with
$$c(0) = \gamma(0); \qquad c'(0) = J(0), \tag{5.57}$$

where $s \in (-\delta,\delta)$ for some $\delta > 0$, and $c'(s) = dc(s)/ds$ as usual. Then define vector
fields $V(s)$ and $W(s)$ along $c(s)$ as the unique solutions of

$$\nabla_{c'}V(s) = 0; \qquad V(0) = \dot\gamma(0); \tag{5.58}$$
$$\nabla_{c'}W(s) = 0; \qquad W(0) = \nabla_t J(0). \tag{5.59}$$

Then the following family does the job:
$$\gamma_s(t) = \exp_{c(s)}\big(t(V(s) + sW(s))\big). \tag{5.60}$$
• For fixed $s$, this is $\gamma_s : t \mapsto \exp_{x_s}(tX_s)$, with $x_s = c(s)$ and $X_s = V(s) + sW(s)$. Now

$$\exp_{x_s}(tX_s) = \gamma^{(x_s)}_{tX_s}(1) = \gamma^{(x_s)}_{X_s}(t) \tag{5.61}$$

by (5.24), so $\gamma_s = \gamma^{(x_s)}_{X_s}$, emanating from $\gamma_s(0) = x_s$. This is surely a geodesic!

• To prove (5.10), we initially put

$$\tilde J(t) = \gamma'_s(t)(s=0). \tag{5.62}$$
Then, using (5.57) - (5.61), we compute
$$\tilde J(0) = \frac{d\gamma_s(0)}{ds}(s=0) = \frac{d\exp_{c(s)}(0)}{ds}(s=0) = \frac{dc(s)}{ds}(s=0) = c'(0) = J(0); \tag{5.63}$$

$$\nabla_t\tilde J(0) = \nabla_t\gamma'_s(t)_{|s=t=0} = \nabla_s\dot\gamma_s(t)_{|s=t=0} = \nabla_{c'}\big(V(s) + sW(s)\big)_{|s=0} = W(0) = \nabla_t J(0). \tag{5.64}$$
Since $J$ and $\tilde J$ solve the same Jacobi equation along $\gamma$, this implies $\tilde J = J$.

¹⁸⁴ Eq. (5.47) may also be proved from (5.31), cf. O'Neill (1983), Lemma 5.1 or Jost (2002), Corollary 4.2.2.


6. Finally, though not needed in what follows, the mathematical underpinning of the equivalence
principle as usually conceived is given by the following extension of geodesic
normal coordinates.¹⁸⁵ We just do the timelike Lorentzian case (which covers what is
physically needed; the adaptation to the Riemannian case is obvious). Let $\gamma: (a,b) \to M$,
where $a < 0 < b$, be an affinely parametrized timelike geodesic with unit speed, i.e.

$$g_\gamma(\dot\gamma,\dot\gamma) = -1, \tag{5.65}$$

and let $(e_0,e_1,e_2,e_3)$ be an orthonormal frame in $T_{\gamma(0)}M$, i.e. (5.33) holds, with $e_0 = \dot\gamma(0)$.
Parallel transport this frame along $\gamma$, i.e., the frame $(e_\mu(t))$ at $T_{\gamma(t)}M$ solves

$$\nabla_{\dot\gamma(t)}e_\mu(t) = 0; \qquad e_\mu(0) = e_\mu, \tag{5.66}$$

so that in particular $e_0(t) = \dot\gamma(t)$. The Fermi normal coordinates $(x^\mu)$ then refer to

$$(x^\mu) \mapsto \exp_{\gamma(x^0)}\Big(\sum_{i=1}^3 x^i e_i(x^0)\Big), \tag{5.67}$$

which defines a coordinate system in a suitable open nbhd of $\gamma$. It follows that at $\gamma(t)$ one
has $\partial_\mu = e_\mu(t)$, so that similarly to the case of GNC along $\gamma$, i.e. $\forall t \in (a,b)$, one obtains

$$g_{\mu\nu}(\gamma(t)) = \eta_{\mu\nu}; \tag{5.68}$$
$$g_{\mu\nu,\rho}(\gamma(t)) = 0; \tag{5.69}$$
$$\Gamma^\rho_{\mu\nu}(\gamma(t)) = 0; \tag{5.70}$$
$$g_{00,ij}(\gamma(t)) = -2R_{0i0j}(\gamma(t)); \tag{5.71}$$
$$g_{0k,ij}(\gamma(t)) = \tfrac{2}{3}\big(R_{0jik}(\gamma(t)) + R_{0ijk}(\gamma(t))\big); \tag{5.72}$$
$$g_{lm,ij}(\gamma(t)) = -\tfrac{1}{3}\big(R_{iljm}(\gamma(t)) + R_{imjl}(\gamma(t))\big); \tag{5.73}$$
$$g_{\mu\nu,0\rho}(\gamma(t)) = 0. \tag{5.74}$$
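
Gauss's Lemma in the form (5.46) and the expansion (5.44) can be checked numerically in a case where the metric in normal coordinates is known in closed form. For the round unit 2-sphere (an example of our choosing) the polar form $dr^2 + \sin^2 r\, d\theta^2$ gives, in geodesic normal coordinates $x = (x^1,x^2)$ with $r = \|x\| < \pi$, the components $g_{ij}(x) = x_ix_j/r^2 + (\sin r/r)^2(\delta_{ij} - x_ix_j/r^2)$. The sketch below, with hypothetical function names, computes $g_{ij}(x)x^j$ and compares the metric with its quadratic approximation $\delta_{ij} - \tfrac{1}{3}(r^2\delta_{ij} - x_ix_j)$, which is what (5.44) amounts to for the unit sphere under the usual curvature conventions.

```python
import math

def metric_gnc_sphere(x):
    """Components g_ij(x) of the round unit 2-sphere metric in geodesic
    normal coordinates x = (x1, x2), valid for 0 <= |x| < pi."""
    r2 = x[0] * x[0] + x[1] * x[1]
    if r2 == 0.0:
        return [[1.0, 0.0], [0.0, 1.0]]
    s = (math.sin(math.sqrt(r2)) ** 2) / r2   # (sin r / r)^2
    return [[x[i] * x[j] / r2
             + s * ((1.0 if i == j else 0.0) - x[i] * x[j] / r2)
             for j in range(2)] for i in range(2)]

def lower_radial(x):
    """Return g_ij(x) x^j; by Gauss's Lemma this should equal delta_ij x^j."""
    g = metric_gnc_sphere(x)
    return [sum(g[i][j] * x[j] for j in range(2)) for i in range(2)]

def quadratic_approximation(x):
    """Second-order expansion delta_ij - (r^2 delta_ij - x_i x_j) / 3."""
    r2 = x[0] * x[0] + x[1] * x[1]
    return [[(1.0 if i == j else 0.0)
             - ((r2 if i == j else 0.0) - x[i] * x[j]) / 3.0
             for j in range(2)] for i in range(2)]
```

The radial identity holds for every $x$ in the chart, not just to second order, whereas the full metric differs from its quadratic approximation by terms of fourth order in $x$.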


¹⁸⁵ This refers to version 3(a) of the equivalence principle (which was not Einstein's), see §1.1. Fermi normal
coordinates were introduced, along arbitrary curves, by Fermi (1922). See also Misner, Thorne, & Wheeler (1973),
§13.6. The specialization to geodesics is taken from Manasse & Misner (1963). A similar construction even works
for higher-dimensional submanifolds S (instead of geodesics), provided S carries dim(S) linearly independent vector
fields each covariantly constant along S. See Schouten & Struik (1936), p. 106 and O'Raifeartaigh (1958).

5.3 Basic causal structure in Lorentzian manifolds

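The following definitions are unique to Lorentzian geometry. A vector $X_x \in T_xM$ is called:¹⁸⁶
• timelike if $g_x(X_x,X_x) < 0$ (so $X_x \neq 0$), and spacelike if $g_x(X_x,X_x) > 0$ or $X_x = 0$;
• lightlike if $g_x(X_x,X_x) = 0$ and $X_x \neq 0$, and null if $g_x(X_x,X_x) = 0$ (so $X_x$ may be zero);
• causal if $g_x(X_x,X_x) \le 0$ and $X_x \neq 0$ (i.e. $X_x$ is either timelike or lightlike).

We denote the sets of these vectors at $T_xM$ by $\mathcal{T}_x$, $\mathcal{S}_x$, $\mathcal{L}_x$, $\mathcal{N}_x$, and $\mathcal{C}_x$, respectively; that is,

$$\mathcal{T}_x = \{X_x \in T_xM \mid g_x(X_x,X_x) < 0\}; \tag{5.75}$$
$$\mathcal{S}_x = \{X_x \in T_xM \mid g_x(X_x,X_x) > 0 \text{ or } X_x = 0\}; \tag{5.76}$$
$$\mathcal{L}_x = \{X_x \in T_xM \mid g_x(X_x,X_x) = 0,\ X_x \neq 0\}; \tag{5.77}$$
$$\mathcal{N}_x = \{X_x \in T_xM \mid g_x(X_x,X_x) = 0\}; \tag{5.78}$$
$$\mathcal{C}_x = \{X_x \in T_xM \mid g_x(X_x,X_x) \le 0,\ X_x \neq 0\}. \tag{5.79}$$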


Diagonalizing the metric $g_x$ to the Minkowski metric $\eta$, cf. (3.2), one sees that the set $\mathcal{T}_x$ of all
timelike vectors in $T_xM$ is disconnected, with two connected components: for any fixed $X_x \in \mathcal{T}_x$
one component $\mathcal{T}_x^+$ consists of all $Y_x \in \mathcal{T}_x$ such that $g_x(X_x,Y_x) < 0$, whereas the other, $\mathcal{T}_x^-$,
contains all $Y_x$ with $g_x(X_x,Y_x) > 0$. In Minkowski space-time $\mathbb{M}$, for any $x$, taking $X_x = (1,0,0,0)$,
we think of $Y_x \in \mathcal{T}_x^+$ as being future-directed (recall that $\eta = \mathrm{diag}(-1,1,1,1)$), and of $Y_x \in \mathcal{T}_x^-$
as past-directed. More generally, we call a Lorentzian manifold $M$ time orientable if it has a
global timelike vector field $T \in \mathfrak{X}(M)$, i.e., $g_x(T_x,T_x) < 0$ at each $x$. In that case,¹⁸⁷ we define

$$\mathcal{T}_x^+ := \{X_x \in \mathcal{T}_x \mid g_x(T_x,X_x) < 0\}; \qquad \mathcal{C}_x^+ := \{X_x \in \mathcal{C}_x \mid g_x(T_x,X_x) < 0\}; \tag{5.80}$$
$$\mathcal{T}_x^- := \{X_x \in \mathcal{T}_x \mid g_x(T_x,X_x) > 0\}; \qquad \mathcal{C}_x^- := \{X_x \in \mathcal{C}_x \mid g_x(T_x,X_x) > 0\}, \tag{5.81}$$

which gives a continuous choice $\mathcal{T}_x^+$ of a distinguished component of $\mathcal{T}_x$ as $x$ varies, and
similarly for causal and lightlike vectors.¹⁸⁸ Topologically we have, also without the $\pm$ suffix,

$$\mathcal{T}_x^\pm = \mathrm{int}(\mathcal{C}_x^\pm); \qquad \partial\mathcal{T}_x^\pm = \partial\mathcal{C}_x^\pm = \mathcal{L}_x^\pm \cup \{0\}. \tag{5.82}$$

Given a global timelike vector field $T$, a causal vector $X_x$ is future-directed (fd) if $X_x \in \mathcal{C}_x^+$,
and past-directed (pd) if $X_x \in \mathcal{C}_x^-$. We call $\mathcal{N}_x = \mathcal{L}_x \cup \{0\} \subset T_xM$ the lightcone at $x$, with forward
lightcone $\mathcal{N}_x^+ = \mathcal{L}_x^+ \cup \{0\}$ and backward lightcone $\mathcal{N}_x^- = \mathcal{L}_x^- \cup \{0\}$ (globally, there are no such things, in general
Lorentzian manifolds). If $M$ is time orientable, many choices of $T$ will give the same $\mathcal{T}_x^+$-component
of $\mathcal{T}_x$, namely any $T'$ for which $g_x(T_x,T'_x) < 0$ for all $x$. If we say that $T \sim T'$ if
this is the case, where both $T$ and $T'$ are timelike, this leaves only two equivalence classes,
represented by $T$ and $-T$. Each of these defines a time orientation of $(M,g)$, and we see that
a time-orientable Lorentzian manifold has just two possible time orientations. Since utmost
generality is not our goal, we include a time orientation in our definition of a space-time:


Definition 5.3 A space-time is a 4d connected Lorentzian manifold with time orientation. 


¹⁸⁶ We use the conventions of Minguzzi (2019), whom we thank for advice on this point.

¹⁸⁷ Counterexamples to this are quite artificial and it can be shown that every Lorentzian manifold has a double
cover that is time orientable, cf. Minguzzi (2019), §1.7.

¹⁸⁸ Conversely, such a choice defines a global timelike vector field $T \in \mathfrak{X}(M)$ and hence a time orientation.

In both the Lorentzian and the Riemannian case, the "length" of $X_x \in T_xM$ may be defined by
$$\|X_x\| = \sqrt{|g_x(X_x,X_x)|}. \tag{5.83}$$

For spacelike vectors $X_x, Y_x$ this "norm" satisfies the triangle inequality $\|X_x + Y_x\| \le \|X_x\| + \|Y_x\|$
as well as the Cauchy-Schwarz inequality $|g_x(X_x,Y_x)| \le \|X_x\|\,\|Y_x\|$, with equality iff $X_x$ and $Y_x$ are
collinear; the rule $g_x(X_x,Y_x) = \|X_x\|\,\|Y_x\|\cos\theta$ defines an angle $\theta$ between $X_x$ and $Y_x$. However,
for causal vectors $X_x, Y_x$ the opposite Cauchy-Schwarz and triangle inequalities hold:¹⁸⁹

$$|g_x(X_x,Y_x)| \ge \|X_x\| \cdot \|Y_x\|; \qquad \|X_x + Y_x\| \ge \|X_x\| + \|Y_x\|, \tag{5.84}$$
with equality iff $X_x$ and $Y_x$ are collinear. The hyperbolic angle $\theta$ between $X_x$ and $Y_x$ now satisfies
$$g_x(X_x,Y_x) = \mp\|X_x\|\,\|Y_x\|\cosh\theta, \tag{5.85}$$

with minus (plus) sign if $X_x$ and $Y_x$ are in the same (opposite) time cone(s).
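
As a quick sanity check of (5.84) and (5.85), here is a small sketch in Minkowski space-time with $\eta = \mathrm{diag}(-1,1,1,1)$. The helper names `eta`, `norm` and `hyperbolic_angle` are our own, and the last function assumes that both arguments are timelike and lie in the same time cone.

```python
import math

def eta(X, Y):
    """Minkowski inner product with signature (-, +, +, +)."""
    return -X[0] * Y[0] + sum(a * b for a, b in zip(X[1:], Y[1:]))

def norm(X):
    """The 'length' (5.83): square root of |eta(X, X)|."""
    return math.sqrt(abs(eta(X, X)))

def hyperbolic_angle(X, Y):
    """Hyperbolic angle of (5.85) for timelike X, Y in the same time cone."""
    c = -eta(X, Y) / (norm(X) * norm(Y))
    return math.acosh(max(c, 1.0))  # clamp guards against rounding below 1
```

For example, the future-directed timelike vectors `X = (2, 1, 0, 0)` and `Y = (3, 0, -2, 1)` have $|\eta(X,Y)| = 6$, which exceeds $\|X\|\,\|Y\| = 2\sqrt{3}$, and $\|X + Y\| = \sqrt{19}$ exceeds $\|X\| + \|Y\| = \sqrt{3} + 2$. For a rest frame and a frame boosted with rapidity $\theta$ the function returns $\theta$.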

We now define the corresponding global notions in $M$ that replace the "infinitesimal" notions
of timelike etc. in each tangent space $T_xM$. Properties of curves are defined through their tangent
vectors: thus a curve $\gamma$ is called (fd) timelike if all its tangent vectors $\dot\gamma$ are (fd) timelike, i.e. if
$\dot\gamma(t) \in \mathcal{T}^{(+)}_{\gamma(t)}$ for all $t$, (fd) lightlike if all its tangent vectors are (fd) lightlike, (fd) causal if all
its tangent vectors $\dot\gamma$ are (fd) causal, and spacelike if all its tangent vectors are spacelike. For
example, in Minkowski space-time $(1,0,0,0)$ is a timelike vector, so that $t \mapsto \gamma(t) = (t,0,0,0)$
is a timelike curve (even a geodesic), since $\dot\gamma(t) = (1,0,0,0)$.¹⁹⁰ This terminology for curves in
turn allows us to define various relations on $M$, of which the three most important ones are:¹⁹¹
• $I^+$: $(x,y) \in I^+$ or $y \in I^+(x)$ or $x \ll y$ if there is a fd timelike curve from $x$ to $y$;
• $J^+$: $(x,y) \in J^+$ or $y \in J^+(x)$ or $x \le y$ if there is a fd causal curve from $x$ to $y$, or $x = y$;
• $E^+ := J^+ \setminus I^+$ (called horismos): $y \in E^+(x)$ if $(x,y) \in J^+$ but $(x,y) \notin I^+$.


There are no timelike curves of zero length from $x$ to $x$, so usually $(x,x) \notin I^+$, but if $(M,g)$
admits closed timelike curves (like Gödel's space-time), then $(x,x) \in I^+$. On the other hand,
$(x,x) \in J^+$ is always true by convention. As usual, we write $x < y$ if $x \le y$ but $x \neq y$. There are
similar relations $I^-$, $J^-$, and $E^-$ defined by $(x,y) \in I^+$ iff $(y,x) \in I^-$, etc. This gives rise to

$$I^+(x) = \{y \in M \mid x \ll y\}; \qquad I^-(x) := \{y \in M \mid y \ll x\}; \tag{5.86}$$
$$J^+(x) := \{y \in M \mid x \le y\}; \qquad J^-(x) := \{y \in M \mid y \le x\}; \tag{5.87}$$
$$E^+(x) := J^+(x) \setminus I^+(x); \qquad E^-(x) := J^-(x) \setminus I^-(x). \tag{5.88}$$


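In Minkowski space-time $\mathbb{R}^4$ these sets are easy to compute, with the result:

$$I^+(x) = \{y \in \mathbb{R}^4 \mid y^0 > x^0 + \|\vec y - \vec x\|\}; \tag{5.89}$$
$$J^+(x) = \{y \in \mathbb{R}^4 \mid y^0 \ge x^0 + \|\vec y - \vec x\|\}; \tag{5.90}$$
$$E^+(x) = \{y \in \mathbb{R}^4 \mid y^0 = x^0 + \|\vec y - \vec x\|\}. \tag{5.91}$$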
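
The three Minkowski formulas translate directly into a classification routine. The sketch below is only an illustration: the function names and the string labels are our own, and the exact floating-point comparison used for the horismos is adequate only for inputs where the arithmetic is exact.

```python
import math

def causal_relation(x, y):
    """Classify y relative to x in Minkowski space-time R^4, with events
    given as (t, x1, x2, x3), following (5.89) - (5.91)."""
    dt = y[0] - x[0]
    dist = math.sqrt(sum((b - a) ** 2 for a, b in zip(x[1:], y[1:])))
    if dt > dist:
        return "I+"    # y lies in the chronological future I^+(x)
    if dt == dist:
        return "E+"    # y lies on the horismos E^+(x); this includes y = x
    return "none"      # y does not lie in the causal future J^+(x)

def in_causal_future(x, y):
    """True iff y is in J^+(x), the disjoint union of I^+(x) and E^+(x)."""
    return causal_relation(x, y) in ("I+", "E+")
```

With `x = (0, 0, 0, 0)`, the event `(2, 1, 0, 0)` is in $I^+(x)$, the event `(5, 3, 4, 0)` lies on the horismos since $\sqrt{3^2 + 4^2} = 5$, and `(1, 2, 0, 0)` is not in $J^+(x)$ at all.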


¹⁸⁹ See O'Neill (1983), Proposition 5.30 and Corollary 5.31 or Minguzzi (2019), Theorems 1.2 and 1.3.

¹⁹⁰ In physics, timelike curves are potential trajectories of massive particles, whereas massless particles move on
lightlike curves. Physical information should spread along causal curves; it will be an important task to prove this.

¹⁹¹ Lemma 5.8 below implies that it does not matter if one uses smooth or piecewise smooth (or even $C^1$) curves.

The idea is that $J^+(x)$ is the causal future of $x$, consisting of all points of $M$ that $x$ can possibly
influence (namely by signals or actions propagating with at most the speed of light).
More generally, the following subsets of $M$ are defined causally for any subset $A \subset M$:

$$I^\pm(A) := \cup_{x\in A} I^\pm(x); \qquad J^\pm(A) := \cup_{x\in A} J^\pm(x); \tag{5.92}$$
$$\mathcal{E}^\pm(A) := \cup_{x\in A} E^\pm(x); \qquad E^\pm(A) := J^\pm(A) \setminus I^\pm(A), \tag{5.93}$$

where it should be noted that $E^\pm(A) \subset \mathcal{E}^\pm(A)$ without equality. Here are some basic facts.


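Proposition 5.4
1. The relations $I^\pm$ and $J^\pm$ are transitive (but $E^\pm$ is not).
2. For any $A \subset M$ the set $I^\pm(A)$ is open in $M$; in particular, $I^\pm(x)$ is open for any $x \in M$.
3. The relations $I^\pm$ are open (i.e. $I^\pm \subset M \times M$ is open).
4. If $x \ll y$ and $y \le z$, or $x \le y$ and $y \ll z$, then $x \ll z$. Consequently, for any $A \subset M$,

$$I^+(A) = I^+(I^+(A)) = I^+(J^+(A)) = J^+(I^+(A)) \subseteq J^+(J^+(A)) = J^+(A). \tag{5.94}$$

5. For any $A \subset M$ one has the relations (with double equality in (5.96) iff $J^+(A)$ is closed):

$$I^+(A) = \mathrm{int}(J^+(A)); \tag{5.95}$$
$$J^+(A) \subseteq \overline{J^+(A)} = \overline{I^+(A)}; \tag{5.96}$$
$$\partial I^+(A) = \partial J^+(A). \tag{5.97}$$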


Proof. For the first claim, concatenate curves.¹⁹² For the second, we prove that any $y \in I^+(x)$
has an open nbhd contained in $I^+(x)$. By definition there exists a timelike curve $\gamma: x \to y$. Take
$z$ on $\gamma$ close enough to $y$ that $y \in U_z$, so that $y = \exp_z(Z)$ for some timelike $Z \in T_zM$. Since
the condition $g_z(Z,Z) < 0$ is open, there is an open nbhd $\mathcal{V}$ of $Z$ in $T_zM$ consisting of timelike
vectors. Then $V_y = \exp_z(\mathcal{V})$ is an open nbhd of $y$, all of whose points can be reached from $z$, and
hence from $x$, by timelike curves, so that $y \in V_y \subset I^+(x)$. With $I^+(x)$, every $I^+(A)$ is open. A
similar argument around $x$ shows that if $(x,y) \in I^+$, then $V_x \times V_y \subset I^+$, so that $I^+$ is open.

Claim 4 follows from Proposition 5.13 below: the hypothesis that there exists a causal curve
from $x$ to $z$ via $y$ that is initially timelike excludes case 3 of Proposition 5.13 (with $y \rightsquigarrow z$).

For (5.95), the inclusion $I^+(A) \subset \mathrm{int}(J^+(A))$ follows because $I^+(A)$ is open and is clearly
contained in $J^+(A)$. Conversely, if $x \in \mathrm{int}(J^+(A))$ then by definition it has a nbhd contained in
$J^+(A)$, which may be shrunk to a normal nbhd $U_x$. Thus there is a pd (= past-directed) timelike
geodesic $\gamma$ emanating from $x$ that lies initially in $U_x$ and contains some point $z \in U_x$; a priori
all we know is that $\gamma$ has a pd timelike tangent vector $\dot\gamma$ at $x$, but since $g(\dot\gamma,\dot\gamma)$ is constant along $\gamma$
this vector remains timelike, and since the condition $g(T,\dot\gamma) > 0$ for past directedness is open,
it will also remain pd at least for a while. Hence $x \in I^+(z)$, so that, since $z \in J^+(A)$, we have
$x \in I^+(J^+(A)) = I^+(A)$ by (5.94). This gives the inclusion $\mathrm{int}(J^+(A)) \subset I^+(A)$, and (5.95) has
been proved. The proof of (5.96) is analogous but slightly more involved,
and is left to the reader.¹⁹³ Eq. (5.96) implies (5.97), since for any set $\partial S = \overline{S} \setminus \mathrm{int}(S)$.


¹⁹² And smoothen them out, which is not necessary if piecewise smooth curves are used, cf. footnote 191.
¹⁹³ See e.g. O'Neill (1983), Lemma 14.6 (2). In this proof geodesically convex nbhds can be replaced with normal
nbhds, as we have done in the previous steps of the proof, compared with e.g. Penrose (1972) and O'Neill (1983).

Against our Minkowski intuition, the relations $J^\pm$ need not be closed.¹⁹⁴ Nonetheless, in a
normal nbhd $U_x$ of some point $x$, one expects the causal relations $I^\pm$, $J^\pm$ and $E^\pm$ to be determined
by those in $T_xM$. In Minkowski space-time this is even true globally: If we identify $T_0\mathbb{M}$ with
$\mathbb{M}$ as usual, then (5.89) - (5.91) show that $I^+(0) = \mathcal{T}_0^+$ defined above, $J^+(0) = \mathcal{C}_0^+ \cup \{0\}$, and
$E^+(0) = \mathcal{N}_0^+$, i.e. the forward lightcone from $0$. In general, the following theorem is a rigorous
statement of the idea that "space-time is locally Lorentz."

In what follows, for any nbhd $U$ of $x$, the set $I_U^+(x)$ consists of all points $y \in U$ such that
$x \ll y$ through a fd timelike curve contained in $U$, and similarly $I_U^-(x)$ for pd timelike curves,
$J_U^\pm(x)$ for fd and pd causal curves, and $E_U^\pm(x) = J_U^\pm(x) \setminus I_U^\pm(x)$. These sets, in which $M$ is "reduced"
to $U$, are not to be confused with e.g. $I^+(x) \cap U$, which is larger than or equal to $I_U^+(x)$, etc.


Theorem 5.5 In any space-time the causal structure "near $x \in M$", i.e. in a normal nbhd
$U_x = \exp_x(\mathcal{U}_x)$, is determined by its linearized version in $T_xM$, in the sense that:

$$I^\pm_{U_x}(x) = \exp_x(\mathcal{T}_x^\pm \cap \mathcal{U}_x); \tag{5.98}$$
$$J^\pm_{U_x}(x) = \exp_x\big((\mathcal{C}_x^\pm \cup \{0\}) \cap \mathcal{U}_x\big); \tag{5.99}$$
$$E^\pm_{U_x}(x) = \exp_x(\mathcal{N}_x^\pm \cap \mathcal{U}_x). \tag{5.100}$$

Moreover, if $c(\cdot)$ is a fd causal curve in $U_x$ starting at $x$, then:
1. If $\dot c(0)$ is timelike, then $c(t) \in I^+_{U_x}(x)$ for all $t > 0$ where $c(t)$ is defined.
2. If $c(t) \in E^+_{U_x}(x)$ for all $t$ where $c(t)$ is defined, then $c$ is a lightlike (pre)geodesic.¹⁹⁵
3. Once $c$ enters $I^+_{U_x}(x)$ (especially after a sojourn on $E^+_{U_x}(x)$), it cannot leave $I^+_{U_x}(x)$.

Finally, within $U_x$ timelike / lightlike / causal geodesics from $x \in M$ are precisely the images
under $\exp_x$ of timelike / lightlike / causal geodesics in $T_xM$ starting at the zero vector.


Point 1 implies that although $y \in I^+_{U_x}(x)$ by definition means that there is a fd timelike curve
$c$ from $x$ to $y$ in $U_x$, i.e. $\dot c(t)$ is timelike for all $t$, to guarantee that $y \in I^+_{U_x}(x)$ it is enough that
there is a fd causal curve from $x$ to $y$ in $U_x$ for which only $\dot c(0)$ is timelike. Similarly, although
$y \in E^+_{U_x}(x)$ by definition says that there is a fd causal curve from $x$ to $y$ in $U_x$ with $y \notin I^+_{U_x}(x)$,
point 2 strengthens this to $y \in E^+_{U_x}(x)$ iff there is a fd causal curve $c$ with $c(t) \notin I^+_{U_x}(x)$ for all $t$.


The proof uses the following facts about Lorentzian metrics, which are of independent interest. 


Lemma 5.6 Let $V$ be a vector space with Lorentzian metric $g$ and associated cones defined as in
(5.75), (5.77), and (5.80), where $T_xM$ is replaced by $V$ and we omit the suffix $x$.

1. If $X \in \mathcal{T}^+$ and $Y \in \mathcal{C}^+$, then $g(X,Y) < 0$.

2. If $g(Y,Y) \le 0$ and $g(X,Y) = 0$ for some lightlike vector $X$, then $Y$ is proportional to $X$.


¹⁹⁴ For example, remove $(t,x) = (1,1)$ from 2d Minkowski space-time $\mathbb{M}_2$ and look at $J^+(0,0)$: the light-ray
$s \mapsto (s,s)$ from $(1,1)$ is missing. Or remove the closed horizontal line segment from $(t,x) = (2,-1)$ to $(2,1)$ from
$\mathbb{M}_2$ and call this $\mathbb{M}_Q$, or Quinten space-time. This removes the closed triangle with corners $(2,-1)$, $(3,0)$ and
$(2,1)$ from $J^+(0,0)$ in $\mathbb{M}_Q$, so that the set $J^+(0,0)$ is not closed in $\mathbb{M}_Q$.

¹⁹⁵ Recall that a pregeodesic is a geodesic up to reparametrization. A lightlike curve is not necessarily a pregeodesic,
e.g. $t \mapsto (t,\sin t,\cos t)$ in 3d Minkowski space-time, and so a lightlike curve starting at $x$ may not lie in $E^+_{U_x}(x)$.

Proof of Lemma 5.6. To prove the first claim, write $X = \lambda T + X'$ and $Y = \mu T + Y'$, where
$g(T,X') = g(T,Y') = 0$ and $\lambda, \mu > 0$. Then $X'$ and $Y'$ are spacelike and as such satisfy the usual
Cauchy-Schwarz inequality $|g(X',Y')| \le \|X'\|\,\|Y'\|$. Assume $g(T,T) = -1$. Then $g(X,X) < 0$
gives $\lambda > \|X'\|$ whilst $g(Y,Y) \le 0$ gives $\mu \ge \|Y'\|$. Thus $g(X,Y) = -\lambda\mu + g(X',Y') < 0$.

The second claim follows (for example) from Lemma 4.16, since the assumption $g(Y,Y) \le 0$
excludes the possibility that $Y$ is spacelike, so that only the other possibility remains.


Proof of Theorem 5.5. To ease notation we omit all reference in notation (e.g. as suffixes) to $U_x$
and $\mathcal{U}_x$, the restriction to which is implied throughout this proof. We use GNC (§5.2), in which

$$c(t) = \exp_x(C(t)); \qquad c^\mu(t) = C^\mu(t), \tag{5.101}$$
where $c(t) \in M$ and $C(t) \in T_xM$. Recall that any $C \in T_xM$ gives a geodesic $\gamma_C$, which in GNC is
$$\gamma_C^\mu(t) = tC^\mu. \tag{5.102}$$

Consider a fd causal curve $c : [0,1] \to U_x$ with $c(0) = x$ and $\dot c(0)$ timelike, i.e.

$$g_{\mu\nu}(c(t))\,\dot c^\mu(t)\dot c^\nu(t) \le 0; \tag{5.103}$$
$$g_{\mu\nu}(c(0))\,\dot c^\mu(0)\dot c^\nu(0) < 0. \tag{5.104}$$

Since $\dot c^\mu(0) = \lim_{t\to 0} c^\mu(t)/t$, for sufficiently small $t$ eq. (5.104) implies
$$g_{\mu\nu}(c(0))\,c^\mu(t)c^\nu(t) < 0. \tag{5.105}$$
Similarly, since $\dot c^\mu(0)$ is fd at $c(0)$, so is $c^\mu(t)$, for small $t$. Eq. (5.45) propagates (5.105) to
$$g_{\mu\nu}(c(t))\,c^\mu(t)c^\nu(t) < 0. \tag{5.106}$$
Furthermore, differentiating (5.105) and using Gauss's lemma in the form (5.46) gives

$$\frac{d}{dt}\big(g_{\mu\nu}(c(0))\,c^\mu(t)c^\nu(t)\big) = 2g_{\mu\nu}(c(0))\,\dot c^\mu(t)c^\nu(t) = 2g_{\mu\nu}(c(t))\,\dot c^\mu(t)c^\nu(t), \tag{5.107}$$

still for small $t$. Eqs. (5.103) and (5.106), the fact that $\dot c^\mu(t)$ is fd for any $t$ by assumption, as is
$c^\mu(t)$ for small $t$, and Lemma 5.6.1 make the right-hand side of (5.107) negative, so that

$$\frac{d}{dt}\big(g_{\mu\nu}(c(0))\,c^\mu(t)c^\nu(t)\big) < 0, \tag{5.108}$$

for small $t$. Hence $g_{\mu\nu}(0)\,c^\mu(t)c^\nu(t)$ can only become more negative as $t$ flows, so that (5.105),
initially derived for small $t$, actually holds for all $t \in [0,1]$. By (5.101) and (5.33) this gives
$$\eta_{\mu\nu}C^\mu(t)C^\nu(t) < 0, \tag{5.109}$$

for all $t \in [0,1]$, or as long as the curve is in $U_x$. This gives $C(t) \in \mathcal{T}_x$, but since $C^\mu(t) = c^\mu(t)$
is also fd for small $t$ and does not leave $\mathcal{T}_x$, by continuity we even have $C(t) \in \mathcal{T}_x^+$ for all $t$.

If now $y \in I^+(x)$, then by definition there is such a curve $c$ with $c(1) = y$ (which is even
timelike for all $t$), so that $y = \exp_x(C(1))$ with $C(1) \in \mathcal{T}_x^+$ and hence $y \in \exp_x(\mathcal{T}_x^+)$. This shows
that $I^+(x) \subset \exp_x(\mathcal{T}_x^+)$. We now prove the converse inclusion $\exp_x(\mathcal{T}_x^+) \subset I^+(x)$.

If $y = \exp_x(C)$ for some $C \in \mathcal{T}_x^+$, then the geodesic (5.102) connects $x$ to $y$ by a fd timelike
curve. Indeed, recall that the quantity $g_{\gamma(t)}(\dot\gamma(t),\dot\gamma(t))$ is constant in $t$ if $\gamma$ is a geodesic. Therefore,
if $\dot\gamma(0) = C$ lies in $\mathcal{T}^+$, then so does $\dot\gamma(t)$ for any $t$. Furthermore, by continuity a timelike curve
cannot change its direction, since for $g_{c(t)}(\dot c(t),T_{c(t)})$ to change sign $\dot c(\cdot)$ must leave $\mathcal{T}$.

We have now proved (5.98) as well as point 1. Point 3 follows from this, since if $y = c(t) \in I^+(x)$,
one may see the remainder of $c(\cdot)$ as the continuation of a fd timelike curve starting at $x$,
to which point 1 applies (since $I^+(x)$ is open, one may smoothen the joint just before $y$).

Next, one shows that each nbhd of each point $c(t)$ on a fd causal curve intersects with
$\exp_x(\mathcal{T}_x^+)$; this is done by giving $c$ a tiny fd timelike twist.¹⁹⁶ This gives $J^+(x) \subset \exp_x(\mathcal{C}_x^+)$;
the converse inclusion is proved as in the timelike case. This proves (5.99).

To prove the inclusion E* (x) C exp,(.%7), take y € E* (x), so that there is a causal curve c 
from x to y, from which it follows that C(t) € €+ for all t € [0, 1]. In particular, y = exp,(C(1)), 
where C(1) € €+. For any C(1) € %,* there would be a fd timelike curve (as follows from 
the first part of the proof), so that C(1) € .%*. Conversely, if y = exp,(C) with C € 4", then 
y €1* (x) by the first part of the proof, so that y € E* (x). This proves (5.100). 

To prove point 2 we show that, if $C(\cdot)$ lies in $\mathcal{C}_x^+$ and $c(\cdot)$ in $J^+_{U_x}(x)$, then $C(\cdot)$ lies in
$\mathcal{N}_x^+$ iff $c(\cdot)$ is a lightlike geodesic (up to parametrization). From right to left this is obvious,
since c(t) must be the geodesic $t \mapsto \exp_x(tC)$ with $C \in \mathcal{N}_x^+$. For the other way round, the condition
$\eta_{\mu\nu} C^\mu(t) C^\nu(t) = 0$ for C to lie in $\mathcal{N}_x^+$ implies, via (5.45) and (5.46), respectively, that

$g_{\mu\nu}(c(t)) c^\mu(t) c^\nu(t) = 0$; $g_{\mu\nu}(c(t)) c^\mu(t) \dot c^\nu(t) = 0$. (5.110)

Hence by Lemma 5.6.2 the vector $\dot c^\mu(t)$ is proportional to the lightlike vector $c^\mu(t)$, so that
$g_{\mu\nu}(c(t)) \dot c^\mu(t) \dot c^\nu(t) = 0$. This makes $c(\cdot)$ a lightlike curve; the property $\dot c^\mu \sim c^\mu$ (in GNC) makes
the left-hand side of the geodesic equation (3.24) proportional to $\dot c$, so that reparametrization
makes $c(\cdot)$ a lightlike geodesic. The final claim then restates what we knew from §5.2.


Since $\exp_x$ is a homeomorphism and the corresponding equality holds in $T_xM$, we also have


jG) =i): (5.111) 
We close this section with a remarkable consequence of Proposition 5.4. 


Corollary 5.7 Any compact space-time contains a closed fd timelike curve. 


Proof. Each set $I^+(x)$ is open by Proposition 5.4. Theorem 5.5 shows that $\bigcup_{x \in M} I^+(x) = M$ (for
any y the set $I^-(y)$ is not empty and any $x \in I^-(y)$ gives $y \in I^+(x)$). Since M is compact,
$M = \bigcup_{i=1}^N I^+(x_i)$ for some $N < \infty$. Hence $x_1 \in I^+(x_i)$ for some i. If i = 1 we have $x_1 \in I^+(x_1)$
and we are done. If not, assume $x_1 \in I^+(x_2)$, so that $x_2 \ll x_1$. Repeating this argument for the
other $x_i$ gives a chain $x_N \ll \cdots \ll x_1$. But also $x_N \in I^+(x_i)$ for some i, which gives $x_i \ll \cdots \ll x_i$.


Compactness is sufficient but not necessary for the existence of closed fd timelike curves: for
example, Gödel’s space-time is topologically $\mathbb{R}^4$ and famously contains such a curve.¹⁹⁷ Perhaps
without real justification, closed fd timelike curves are supposed not to exist in the real world.
But despite Corollary 5.7, there is a lively mathematical literature on compact space-times.¹⁹⁸


196 See Senovilla (1998), Proposition 2.1 or Minguzzi (2019), Theorem 2.9 and Corollary 2.10. The difficulty of 
this step may be illustrated by the fact that the argument in Hawking & Ellis (1973), Proposition 4.5.1, is wrong. 

197 See Gödel (1949). For a very nice treatment of Gödel’s space-time see Malament (2012), §3.1. For a very
simple (spatially) noncompact example, take the Minkowski hypercylinder $M = \{(x^0, \vec{x}) \in \mathbb{R}^4 \mid 0 \le x^0 \le 1\}/\sim$,
where $(0, \vec{x}) \sim (1, \vec{x})$, with induced Minkowski metric. Then $I^+(x) = I^-(x) = M$ for all $x \in M$.

198 In d = 2 one has interesting disanalogies between compact Lorentz surfaces and compact Riemann surfaces;
for example, the uniformization theorem looks completely different in the Lorentzian case (Weinstein, 1996).
5.4 Do geodesics extremize length? Local case


Using (5.83), one may now define the length of a curve $c : [a,b] \to M$ in a Lorentzian manifold
by generalizing (3.16) in the obvious way to the parametrization-independent expression

$L(c) = \int_a^b dt\, \|\dot c(t)\| = \int_a^b dt\, \sqrt{|g_{c(t)}(\dot c(t), \dot c(t))|}$ (c general); (5.112)

$= \int_a^b dt\, \sqrt{g_{c(t)}(\dot c(t), \dot c(t))}$ (c spacelike); (5.113)

$= \int_a^b dt\, \sqrt{-g_{c(t)}(\dot c(t), \dot c(t))}$ (c timelike or causal); (5.114)
$= 0$ (c lightlike). (5.115)


The formula (3.16) for the Riemannian case is the same as the spacelike case (5.113) here. It 
does not matter if we work with smooth curves or with piecewise smooth curves (where in the 
latter case (5.112) is defined by adding the smooth pieces in the obvious way), since we have:¹⁹⁹


Lemma 5.8 If c is a piecewise smooth curve, there is a sequence $(c_n)$ of smooth curves such
that $c_n(t) \to c(t)$ and $\dot c_n(t) \to \dot c(t)$ pointwise (in the topology of M and TM, respectively), and

$L(c) = \lim_{n \to \infty} L(c_n)$. (5.116)

Moreover, if c is causal, then the approximating curves $c_n$ may be chosen so as to be timelike.


Consequently, if we naively try to define a distance function on a Lorentzian manifold M by the 
expression that in the Riemannian case defines a proper metric in the topological sense, namely 


$d_R(x,y) := \inf\{L(c) \mid c : [a,b] \to M,\ c(a) = x,\ c(b) = y\}$, (5.117)


where the c are smooth or, equivalently, piecewise smooth curves, then $d_R(x,y) = 0$ for any
$x, y \in M$. Indeed, using covers by convex nbhds one sees that any two points can be connected
by a piecewise smooth lightlike curve c, which has length L(c) = 0, see (5.114). According to
Lemma 5.8, the infimum in (5.117) remains zero if it is taken over smooth curves. In case that x
and y are spacelike separated (in the sense that they can be connected by a spacelike curve), this
can be remedied by stipulating that the infimum in (5.117) be taken over all spacelike curves,
blocking the lightlike construction. However, if x < y we have $d_R(x,y) = 0$ even if we restrict
the infimum in (5.117) to piecewise smooth causal curves, or even to smooth fd timelike curves.
For causal curves, a more useful “distance” function is the so-called Lorentzian distance 


$d_L(x,y) := \sup\{L(c) \mid c : [a,b] \to M,\ c(a) = x,\ c(b) = y\}$, (5.118)
defined if x < y, where the supremum is over all fd causal curves from x to y. Eq. (5.84) implies
$d_L(x,z) \ge d_L(x,y) + d_L(y,z)$, (5.119)


whenever x < y and y < z (which implies x < z), which is a reversal of the triangle inequality for
a metric (Minguzzi, 2019, Theorem 2.32). Taking e.g. x = (0,0), y = (1,1) and z = (0,2) in 2d
Minkowski space-time shows this dramatically, since $d_L(x,z) = 2$ whilst $d_L(x,y) = d_L(y,z) = 0$,
whereas with Euclidean metric $d_E$ one has $d_E(x,z) = 2$ whilst $d_E(x,y) = d_E(y,z) = \sqrt{2}$.
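
The numbers in this example can be checked with a few lines of Python. This is only a sketch under stated assumptions: points are written as (space, time) pairs, which is the ordering under which y and z are causally related to x as the example requires, and the Lorentzian distance in flat 2d Minkowski space-time is taken to be the length of the straight segment. The function names are illustrative and not from the text.

```python
import math

def lorentzian_distance(p, q):
    """Length of the straight segment from p to q in 2d Minkowski space-time.

    Points are (space, time) pairs (an assumption about the ordering).
    Returns None when q is not in the causal future of p, where the
    Lorentzian distance of the text is not defined.
    """
    dx = q[0] - p[0]
    dt = q[1] - p[1]
    if dt < 0 or dt * dt < dx * dx:
        return None
    return math.sqrt(dt * dt - dx * dx)

def euclidean_distance(p, q):
    """Ordinary Euclidean distance between the same two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

x, y, z = (0.0, 0.0), (1.0, 1.0), (0.0, 2.0)

d_xz = lorentzian_distance(x, z)  # straight timelike segment
d_xy = lorentzian_distance(x, y)  # lightlike segment
d_yz = lorentzian_distance(y, z)  # lightlike segment

# Reverse triangle inequality (5.119) versus the Euclidean triangle inequality.
reverse_triangle_holds = d_xz >= d_xy + d_yz
triangle_holds = euclidean_distance(x, z) <= euclidean_distance(x, y) + euclidean_distance(y, z)
```

The detour through y consists of two lightlike segments, so it has Lorentzian length zero although its Euclidean length exceeds that of the direct path.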


199 See Lemma 4.6.1 and Corollary 4.6.1 in Kriele (1999). Also cf. Minguzzi (2019), §2.8, and Theorem 2.37.
To sum up, we see that in the Riemannian case, where our spatial intuition comes from,
a detour (notably from a geodesic) increases the length of a curve between two given points
x and y, whereas in the Lorentzian case it decreases the length of a causal curve, assuming
$y \in J^+(x)$. In particular, the critical points of the length functional on causal curves (namely
causal geodesics) may be expected to maximize length (whereas in the Riemannian case they
minimize length). This is indeed what happens, at least for nearby points (or small times):²⁰⁰


Proposition 5.9 Let $x \in M$ and let $y \in U_x$ be contained in a normal nbhd $U_x$ of x (cf. §5.2).


1. Riemannian (R): x and y are connected by a unique (up to parametrization) geodesic $\gamma$ of
minimal length compared to other curves c from x to y lying within $U_x$.


2. Lorentzian (L): If $y \in I^+_{U_x}(x)$ or $y \in E^+_{U_x}(x)$, then x and y are connected by a unique (up to
parametrization) fd timelike or lightlike geodesic $\gamma$, respectively, which in both cases has
maximal length compared to other fd causal curves c from x to y in $U_x$.


As in Theorem 5.5, a fd causal curve from x to $y \in E^+_{U_x}(x)$ must be a lightlike (pre)geodesic. The
reason a lightlike geodesic from x to $y \in E^+_{U_x}(x)$, which has zero length, can nonetheless be of
maximal length in the said way is that all other causal curves from x to y have zero length, too.


Proof. Before we discuss general Riemannian or Lorentzian manifolds, it is helpful to treat
Euclidean space (R), i.e. $E = \mathbb{R}^3$ with metric $g = \mathrm{diag}(1,1,1)$ and Minkowski space (L), i.e. $M =
\mathbb{R}^4$ with $g = \mathrm{diag}(-1,1,1,1)$, with norm (5.83) and length (5.112). The following considerations
rely on the radius function $r : M \to \mathbb{R}^+$ and the radial vector field R on M, defined by


$r(z) := \|z\|$; (5.120)
$R_z := z/\|z\|$, (5.121)


where we identify $T_zM$ with M. In $\mathbb{R}^3$ one has $R = \partial/\partial r$ in polar coordinates. Note that
$g_z(R_z, R_z) = \pm 1$; (5.122)


here and in what follows the plus sign is for (R) and the minus sign applies to (L). Without loss
of generality we may put x = 0. For any y whatsoever (R) or any y such that x < y (L) the line
$\gamma(t) = yt$ is a geodesic $\gamma : [0,1] \to M$ from x to y, with length


$L(\gamma) = \int_0^1 dt\, \|y\| = \|y\| = r(y)$. (5.123)
For (L) we first do the timelike case. Take a fd causal curve c from x = 0 to y. Then
$\dot c = \pm g(\dot c, R) R + N$, (5.124)
where $g(N, R) = 0$, decomposes $\dot c$ into a parallel and a normal component to R. It follows that
$\|\dot c\|^2 = g(\dot c, R)^2 \pm g(N,N)$, (5.125)
with $g(N,N) \ge 0$ also in (L), where the vector R is timelike, and hence N is spacelike. Hence


$\|\dot c\| \ge g(\dot c, R)$; (R) (5.126)
$\|\dot c\| \le -g(\dot c, R)$, (L) (5.127)


200 This is also suggested by the twin paradox of special relativity: the twin sister leaving earth and returning
cannot always travel on a geodesic and hence her curve c has shorter length (experienced by her as proper time).
since for L we have $g(\dot c, R) < 0$. The completion of the argument relies on the computation


$\frac{d}{dt} r(c(t)) = \frac{d}{dt} \|c(t)\| = \frac{d}{dt} \sqrt{\pm g(c(t), c(t))} = \pm \frac{g(\dot c(t), c(t))}{\|c(t)\|} = \pm g(\dot c(t), R_{c(t)})$. (5.128)


Assuming $c : [0,1] \to M$ with c(0) = x = 0 and c(1) = y, together with the estimates (5.126) -
(5.127) and (5.123), the above computation gives


$L(c) = \int_0^1 dt\, \|\dot c(t)\| \ge \int_0^1 dt\, g(\dot c(t), R_{c(t)}) = \int_0^1 dt\, \frac{d}{dt}\, r \circ c(t) = r(y) = L(\gamma)$; (R) (5.129)

$L(c) = \int_0^1 dt\, \|\dot c(t)\| \le -\int_0^1 dt\, g(\dot c(t), R_{c(t)}) = \int_0^1 dt\, \frac{d}{dt}\, r \circ c(t) = r(y) = L(\gamma)$, (L) (5.130)


with equalities iff g(N,N) = 0 and hence, since N is spacelike, iff N(t) = 0 for all t. In that case,
$\dot c$ is proportional to R and hence to c, i.e. $\dot c(t) = \lambda(t) c(t)/\|c(t)\|$ for some function $\lambda$. This is
solved by $c(t) = z f(t)$ for some $z \in M$ and suitable f. Since c(1) = y this means that $c(t) =
y f(t)/f(1)$. Because $\gamma(t) = yt$, this gives $c = \gamma$ up to reparametrization, i.e. $c(t) = \gamma(s(t))$.

In general, define the radius $r : U_x \to \mathbb{R}^+$ and the radial vector field R on $U_x$ by


$r(\exp_x(Z)) := \|Z\|$; (5.131)

$R_{\exp_x(Z)} := \frac{(\exp_x)'_Z(Z)}{\|Z\|}$. (5.132)


A curve $c : [0,1] \to M$ from x to y in $U_x$ may be written as $c(t) = \exp_x(C(t))$. Then


$\frac{d}{dt}\, r \circ c(t) = \frac{d}{dt} \|C(t)\| = \pm \frac{g_x(\dot C(t), C(t))}{\|C(t)\|} = \pm \frac{g_{c(t)}\big((\exp_x)'_{C(t)}(\dot C(t)), (\exp_x)'_{C(t)}(C(t))\big)}{\|(\exp_x)'_{C(t)}(C(t))\|} = \pm g_{c(t)}(\dot c(t), R_{c(t)})$, (5.133)


where we used Gauss’s lemma in both the denominator and the numerator. On the other hand,
“the” geodesic within $U_x$ from x to $y = \exp_x(Y)$ is given by $\gamma(t) = \exp_x(tY)$, where


$L(\gamma) = \int_0^1 dt\, \|\dot\gamma(t)\| = \|Y\| = r(y)$, (5.134)


since for geodesics $\gamma$ the velocity $\|\dot\gamma(t)\|$ is t-independent. Eqs. (5.133) and (5.134) imply
that the computation (5.129) - (5.130) can be repeated verbatim, once again yielding


$L(c) \ge L(\gamma)$; (R) (5.135)
$L(c) \le L(\gamma)$. (L) (5.136)


Finally, also the proof of uniqueness of $\gamma$ up to reparametrization reduces to the flat case, since the
condition $\dot c(t) \sim R_{c(t)}$ comes down to $\dot C(t) \sim C(t)$. This completes the timelike case $y \in I^+_{U_x}(x)$.
The case $y \in E^+_{U_x}(x)$ follows from the end of the proof of Theorem 5.5, which excludes timelike
curves from x to y and forces the lightlike curves to be lightlike (pre)geodesics.
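
The flat Lorentzian case of this proof can be illustrated numerically. The following is a sketch rather than part of the argument: it works in 2d Minkowski space-time, writes velocities as (dx/dt, dt/dt), and compares the straight geodesic from (0,0) to (0,1) with the hypothetical one-parameter family of detours x(t) = a sin(πt), which stay timelike as long as aπ < 1. The midpoint rule and all names are choices made here for illustration.

```python
import math

def lorentzian_length(velocity, n=10_000):
    """Midpoint-rule approximation of L(c) = integral over [0, 1] of sqrt(-g(c', c')).

    `velocity(t)` returns (dx/dt, dt/dt); in these conventions
    -g(c', c') = (dt/dt)^2 - (dx/dt)^2 for a 2d Minkowski metric.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        vx, vt = velocity(t)
        s = vt * vt - vx * vx
        if s < 0:
            raise ValueError("curve is not causal at t = %r" % t)
        total += math.sqrt(s) * h
    return total

def detour(a):
    """Velocity of the curve t -> (a sin(pi t), t) from (0, 0) to (0, 1)."""
    return lambda t: (a * math.pi * math.cos(math.pi * t), 1.0)

amplitudes = (0.0, 0.1, 0.2, 0.3)  # a = 0 is the straight geodesic
lengths = [lorentzian_length(detour(a)) for a in amplitudes]
```

In this family the length is largest for a = 0 and shrinks as the amplitude of the detour grows, in line with (5.136) and with the twin paradox.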
5.5 Do geodesics extremize length? Global case


Being restricted to normal neighbourhoods $U_x$, Proposition 5.9 is local in nature; things may
change beyond $U_x$ (for given x). Here is the crucial notion, which applies to the Riemannian case
in general, and to timelike or spacelike (but not: lightlike) geodesics in Lorentzian manifolds.


Definition 5.10 A conjugate point along a geodesic $\gamma : [a,b] \to M$ is a point $\gamma(c)$, $c \in (a,b]$,
for which there is a nonzero Jacobi field J along $\gamma : [a,c] \to M$ that vanishes at both a and c.


A conjugate point is defined relative to $\gamma(a)$. It is independent of the parametrization of $\gamma$. The
point of interest is the earliest one, if it exists. Proposition 5.2 implies that J is orthogonal to $\dot\gamma$.


Proposition 5.11 A point $z = \gamma(c)$ on a geodesic $\gamma : [a,b] \to M$ is conjugate iff the exponential map
$\exp_{\gamma(a)}$ becomes singular at z (in that its derivative fails to be injective at the point $\gamma(c)$).


Proof. This easily follows from (5.31); we leave the details to the reader. 


Some intuition may come from the two-sphere, where the first conjugate point along a great 
circle emanating from (say) the South Pole is the North Pole, at which, all of a sudden, not one 
unique connecting curve of minimal length exists, but infinitely many. Beyond the North Pole, 
the initial great circle is not even the shortest one anymore, as one may go the other way round. 
We know from Proposition 5.1 that J arises from a variation of $\gamma$, as in (5.10), but be aware
that the boundary conditions J(a) = 0 and J(c) = 0 merely imply that the variations fix the
endpoints of $\gamma_s$ as $s \to 0$, so that (against the intuition from the two-sphere) the existence of J
does not guarantee the existence of even one alternative geodesic from $\gamma(a)$ to $\gamma(c)$.
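
The two-sphere picture can be made quantitative in a small sketch. It rests on a standard fact that is not derived in this section and is therefore an assumption here: on a surface of constant curvature K, a Jacobi field orthogonal to a unit-speed geodesic has the form j(t)E(t) with E parallel and j'' + K j = 0. The code integrates this scalar equation with j(0) = 0, j'(0) = 1 by a hand-written Runge-Kutta step and reports the first zero of j, i.e. the parameter value of the first conjugate point. Function and parameter names are illustrative.

```python
import math

def first_conjugate_time(curvature, t_max=10.0, dt=1e-4):
    """First t > 0 with j(t) = 0 for j'' + K j = 0, j(0) = 0, j'(0) = 1.

    Uses classical fourth-order Runge-Kutta steps and linear interpolation
    at the sign change. Returns None if j has no zero in (0, t_max).
    """
    def rhs(j, v):
        return v, -curvature * j

    j, v, t = 0.0, 1.0, 0.0
    while t < t_max:
        k1j, k1v = rhs(j, v)
        k2j, k2v = rhs(j + 0.5 * dt * k1j, v + 0.5 * dt * k1v)
        k3j, k3v = rhs(j + 0.5 * dt * k2j, v + 0.5 * dt * k2v)
        k4j, k4v = rhs(j + dt * k3j, v + dt * k3v)
        j_new = j + dt * (k1j + 2 * k2j + 2 * k3j + k4j) / 6.0
        v_new = v + dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
        if j > 0.0 and j_new <= 0.0:
            # j changed sign inside this step: interpolate the zero linearly.
            return t + dt * j / (j - j_new)
        j, v, t = j_new, v_new, t + dt
    return None

unit_sphere = first_conjugate_time(1.0)   # antipodal point, at distance pi
flat_plane = first_conjugate_time(0.0)    # j(t) = t never returns to zero
hyperbolic = first_conjugate_time(-1.0)   # j(t) = sinh(t) never returns to zero
```

For the unit sphere the first zero sits at distance π, the antipodal point of the text, whereas for K ≤ 0 no conjugate point is found in the integration range.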

Nonetheless, eq. (5.22) suggests that since $L''(\gamma) = 0$ at a conjugate point, something happens
to the extremization property of $\gamma$. Proposition 5.11 confirms this, as it suggests that the local
analysis of Proposition 5.9 may break down. The precise situation is as follows.²⁰¹


Theorem 5.12 1. Riemannian case: A geodesic $\gamma : [a,b] \to M$ locally minimizes the length
of curves from $x = \gamma(a)$ to $y = \gamma(b)$ iff there is no conjugate point on $\gamma$ that lies between x and y.


2. Lorentzian case: A timelike geodesic $\gamma : [a,b] \to M$ locally maximizes the length of curves
from $x = \gamma(a)$ to $y = \gamma(b)$ iff there is no conjugate point on $\gamma$ that lies between x and y.


The “⇐” part may be proved by remarking that, as we saw in §5.4, in the Lorentzian case
timelike geodesics start out maximizing length, so that $L''(\gamma) < 0$. According to (5.21), this
remains the case until a conjugate point is encountered, so if this is never the case, one will have
$L''(\gamma) < 0$ forever (or at least as long as the geodesic is defined). Likewise in the R case.

For the “⇒” part, we show that the sign of $L''(\gamma)$ may indeed change once a conjugate point
(at which its value is zero) has been crossed; in the L case, $L''(\gamma)$ then becomes positive, and a
timelike geodesic can be constructed that is longer than the given one (whereas in the R case the
opposite sign change leads to new and shorter geodesics between the given endpoints).²⁰²

Indeed, let $c \in (a,b)$, with associated Jacobi field J along $\gamma([a,c])$ for which J(a) = 0 and
J(c) = 0. Then $\nabla_{\dot\gamma} J(c) \ne 0$ (since otherwise J = 0), and by Proposition 5.1 there exists a


201 The word ‘local’ here means that $\gamma([a,b])$ has a nbhd U (in M) such that $\gamma$ does or does not minimize or
maximize length in comparison with all curves in U, i.e. with respect to “nearby” curves only.

202 The remainder of the proof is based on the final part of the proof of Hawking & Ellis (1973), Prop. 4.5.8. For
alternative proofs see Jost (2002), Theorem 4.3.1, for the Riemannian case and O’Neill (1983), Proposition 10.10
and Theorem 10.17, Wald (1984), Theorem 9.5.3, or Minguzzi (2019), Theorem 6.16, for the Lorentzian case.
one-parameter family of geodesics $(\gamma_s)$ for which $J = \gamma'_{s=0}$. Since only the component of J that
is orthogonal to $\dot\gamma$ is relevant, we can make J orthogonal to $\dot\gamma$ altogether, cf. the discussion after
the statement of Proposition 5.1. Furthermore, we extend J from $\gamma([a,c])$ to $\gamma([a,b])$ by making
it zero on (c,b]. Now find some vector field K along $\gamma : [a,b] \to M$ that is also orthogonal to $\dot\gamma$
and in addition satisfies the boundary conditions


$K(a) = K(b) = 0$; (5.137)
$g_{\gamma(a)}(\nabla_{\dot\gamma} J, K) = 0$; (5.138)
$g_{\gamma(c)}(\nabla_{\dot\gamma} J, K) = -1$. (5.139)


This is possible, since unlike the Jacobi field J, the vector field K is not meant to satisfy any
particular equation. We now take $\varepsilon > 0$ and consider the vector field


$M = \varepsilon K + \varepsilon^{-1} J$. (5.140)


For any family of curves for which $\gamma'_{s=0} = M$, we then compute the second variation (5.21),
in which by construction $\gamma'$ is replaced by M. Since J satisfies the Jacobi equation, the term
proportional to $\varepsilon^{-2}$, which only involves J, vanishes. The term proportional to $\varepsilon^2$, which
only involves K, stands; call it $C\varepsilon^2$ (where C may have either sign). One of the cross terms
proportional to $\varepsilon \cdot \varepsilon^{-1} = 1$, involving each of J and K linearly, vanishes by the Jacobi equation
for J. In the L case to be specific (where the - sign in (5.21) has to be deleted), the other cross
term contributes


1 j i 5 
L"(y) = Ce? + -f dt gyr) (I(t), VFK (t) — (H(t), K (E) ) YC). (5.141) 
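Here, using (3.65) and (3.52), we have

$g_{\gamma(t)}(J(t), \nabla^2_{\dot\gamma} K(t)) = \frac{d}{dt}\, g_{\gamma(t)}(J(t), \nabla_{\dot\gamma} K(t)) - g_{\gamma(t)}(\nabla_{\dot\gamma} J(t), \nabla_{\dot\gamma} K(t))$, (5.142)


of which the first term vanishes upon integration, as J(a) = J(c) = 0. The second term gives


$-g_{\gamma(t)}(\nabla_{\dot\gamma} J(t), \nabla_{\dot\gamma} K(t)) = -\frac{d}{dt}\, g_{\gamma(t)}(\nabla_{\dot\gamma} J(t), K(t)) + g_{\gamma(t)}(\nabla^2_{\dot\gamma} J(t), K(t))$, (5.143)
whose last term combines with the curvature term in (5.141) to contribute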


8y(1)(K(t), Ved (t) - AH I (E)E), 


which vanishes by the Jacobi equation for J (using the symmetries of the Riemann tensor R). 


Finally, the first term in (5.143) gives, upon integration, +1, so that overall we obtain
$L''(\gamma) = C\varepsilon^2 + 1$. (5.144)


Whatever the sign of C, for $\varepsilon$ small enough we can arrange $L''(\gamma) > 0$, and so, since it started
out negative, the sign of $L''(\gamma)$ has changed across a conjugate point, as claimed.

It is by no means excluded that there may be other variations for which $L''(\gamma)$ remains
negative; for example, by picking some K for which the sign in (5.139) is positive. All that has
been proved is the existence of a family of variations for which the sign does change, which is
enough to prove the theorem. A more comprehensive and systematic way to handle this situation
is to introduce the index form for the second variation of L, which, across a conjugate point, loses
its property of being negative definite (L) or positive definite (R).


203 See e.g. Jost (2002), §4.2 or O’Neill (1983), chapter 10.
Theorem 5.12 gives necessary and sufficient conditions for the existence of length-minimizing
geodesics in the Riemannian case and length-maximizing timelike geodesics in the Lorentzian
case, but we need to find out when these conditions are met. In the Riemannian case this is
settled by the second part of the Hopf-Rinow Theorem 3.4 in §3.2. The point is that prior to
this theorem we only knew that by definition a given geodesic extremizes the length function
$c \mapsto L(c)$ defined by (3.16) compared to local variations, and that, still locally, it minimizes
length until a conjugate point is encountered (and fails to do so afterwards). Theorem 3.4 is a
different kind of statement: it guarantees that some curve between any two given points x and y
exists that globally minimizes $L(\cdot)$, i.e., not merely compared with nearby curves from x to y,
but among all curves from x to y, and then this curve must be a geodesic by definition.

There is no full Lorentzian analogue of this, and for the result that comes closest (i.e. Theorem 
5.30), geodesic completeness has to be replaced by global hyperbolicity, a concept that is unique 
to Lorentzian geometry (see §5.7). But we first return to the local result Proposition 5.9.2. A 
priori there seem to be four possibilities, which one could organize into a 2 x 2 matrix: 


• For $y \in J^+(x)$ we have (i) either $y \in I^+(x)$ or $y \notin I^+(x)$, and (ii) some fd causal curve $\gamma$
from x to y either does or does not maximize $L(\cdot)$ among all fd causal curves c from x to y.


The key insight is that, as in Minkowski space-time, the second options cannot go together: 
although Proposition 5.9.2 itself does not hold globally, the following consequence of it does. 


Proposition 5.13 If $y \in J^+(x)$ and $\gamma$ is a fd causal curve from x to y, then there are three
mutually exclusive possibilities (where the reference curves c are causal and go from x to y):


1. $y \in I^+(x)$, and there is a timelike curve c with $L(c) > L(\gamma)$;


2. $y \in I^+(x)$, and $\gamma$ maximizes $L(\cdot)$ among all c, so that $\gamma$ is a timelike (pre)geodesic;


3. $y \notin I^+(x)$, and $\gamma$ is a lightlike (pre)geodesic that maximizes $L(\cdot)$ among all c.


Proof (sketch).²⁰⁵ Since $\gamma$ lies in a compact set in M one can pick finitely many points $x_1, \ldots, x_N$
on $\gamma$ (where $x_1 = x$ and $x_N = y$) and a cover of $\gamma$ by pairwise overlapping normal nbhds $U_{x_i}$,
$i = 1, \ldots, N-1$, such that $x_{i+1} \in U_{x_i}$ and $U_{x_i}$ contains the entire segment of $\gamma$ from $x_i$ to $x_{i+1}$.
First, if just one segment is timelike, then $y \in I^+(x)$. To see this, assume the segment from
$x_k$ to $x_{k+1}$ is timelike, so that $x_{k+1} \in I^+(x_k)$. If $x_{k+2} \in E^+(x_{k+1})$, then, since $I^+(x_k)$ is open,
one can move $x_{k+1}$ so as to keep the segment $x_k \to x_{k+1}$ timelike whilst making the segment
$x_{k+1} \to x_{k+2}$ timelike, too. If necessary this can be repeated for all future and past points (relative
to $x_k$), yielding a timelike curve $x \to y$. Hence for case 3 in the proposition to arise, $\gamma$ must be a
lightlike curve, upon which Theorem 5.5.2 shows it must be a (pre)geodesic. The only causal
curves from x to $y \in E^+(x)$, then, are lightlike curves, so that $\gamma$, having length zero, trivially
maximizes L over all other causal curves from x to y, since these also have zero length.²⁰⁶


204 By definition (pre)geodesics are at least $C^1$, so that a curve consisting of segments of lightlike geodesics with
corners is not a (lightlike) geodesic. Otherwise, any two points could be connected by a lightlike geodesic.

205 See Minguzzi (2019), Theorems 2.20 and 2.22, for details. In its most general form the proposition is valid for
continuous causal curves, see Definition 5.20 below, and indeed the proof is best understood in that light.

206 This suggests that $E^+(x)$ might be a good global analogue of the fd lightcone $\mathcal{N}_x^+$ in $T_xM$, where $y \in E^+(x)$ if
and only if there is a lightlike geodesic from x to y. But in general this implication is only valid from left to right,
as Proposition 5.13 shows: with strong focusing of light rays (e.g. near a black hole) two points x and y may be
connected by a lightlike geodesic as well as by a timelike curve, so that $y \in I^+(x)$ and hence $y \notin E^+(x)$.
The case distinction between 1 and 2 is then trivial, since if $\gamma$ maximizes L even globally, then it
certainly does so locally, so that it must be a geodesic (i.e. case 2).


Corollary 5.14 1. A fd causal curve from x to $y \in E^+(x)$ is a lightlike (pre)geodesic.


2. A fd causal curve from x to $y \in J^+(x)$ that does not enter $I^+(x)$ is a lightlike (pre)geodesic.


No. 1 is case 3 of Proposition 5.13. This implies no. 2, whose hypothesis forces $y \in E^+(x)$.


The following ideas will play a key role in causal theory, culminating in their relevance to 
the abstract theory of black holes (see chapter 10). We call a subset $S \subset M$ achronal if


$I^+(S) \cap S = \emptyset \;\Leftrightarrow\; I^+(S) \cap I^-(S) = \emptyset$; (5.145)
that is, if no two points of S can be connected by a timelike curve. Corollary 5.14.1 then gives:


Corollary 5.15 Every causal curve in an achronal set is a maximizing lightlike (pre)geodesic.²⁰⁷


At the other extreme, sets with timelike curves clearly cannot be achronal, so that the above
corollary covers the situations between spacelike and timelike and therefore is not very surprising
at all. The few cases where it is nontrivial include the following: if $A \subset M$, then


$S = \partial I^+(A)$, (5.146)
called an achronal boundary,²⁰⁸ is indeed achronal. To see this, first note the implication
$x \in \partial I^+(A) \;\Rightarrow\; I^+(x) \subseteq I^+(A)$. (5.147)
Indeed:
• If $y \in I^+(x)$ then $I^-(y)$ is a nbhd of x and hence $I^-(y) \cap I^+(A)$ is not empty.
• If $z \in I^-(y)$ then $y \in I^+(z)$, and if also $z \in I^+(A)$ then $y \in I^+(A)$ by transitivity of $I^+$.
• Hence if $x, y \in \partial I^+(A)$ satisfy $y \in I^+(x)$, then $y \in I^+(A)$, which contradicts $y \in \partial I^+(A)$,
since $I^+(A)$ is open.


Corollary 5.16 For any $A \subset M$ for which $J^+(A)$ is closed, each $x \in \partial I^+(A) \setminus A$ lies on a lightlike
(pre)geodesic. Thus $\partial I^+(A) \setminus A$ is a null hypersurface, which is ruled by lightlike geodesics.²⁰⁹


Proof. If $J^+(A)$ is closed, then $\overline{I^+(A)} = J^+(A)$ by Proposition 5.4.5. Eq. (5.97) gives
$\partial I^+(A) = \partial J^+(A) = J^+(A) \setminus I^+(A)$, (5.148)
since $I^+(A)$ is open by Proposition 5.4.2 and $J^+(A)$ is closed by assumption. Since
$J^+(J^+(A)) = J^+(A)$, (5.149)


see (5.94), there is a causal curve c through any $x \in J^+(A)$. Corollary 5.15 then applies.


Using Lemma 5.26 below this can be shown more generally for A closed,²¹⁰ but the stated
version is enough for Penrose’s singularity theorem as well as various other applications.


207 An achronal set need not contain any causal curve at all; it can be spacelike, e.g. $x^0 = c$ in Minkowski space-time. But a spacelike
hypersurface need not be achronal either! Just take a Lorentzian cylinder with a slowly creeping up spacelike line.

208 More generally, an achronal boundary is a set $\partial F$ where F is a future set. See §6.4 and §10.7.

209 See Definition 4.15. This means that a unique lightlike geodesic passes through every point of $\partial I^+(A) \setminus A$.

210 See Proposition 10.16 in §10.7 below, which also gives more information about the lightlike geodesics.
5.6 Properties of causal curves


Many concepts in causal theory, like the Cauchy surfaces to be studied in §5.8, as well as the 
singularity theorems in chapter 6, rely on some specific definitions and properties of curves. 


Definition 5.17 • A curve $c : [a,b) \to M$, where $a < b \le \infty$, is future extendible iff
$\lim_{t \to b} c(t)$ exists in M. If not, c is future inextendible.²¹¹ Likewise for $c : (a,b) \to M$.


• A curve $c : (a,b] \to M$, where $-\infty \le a < b$, is past extendible iff $\lim_{t \to a} c(t)$ exists. Otherwise,
c is past inextendible. Likewise for $c : (a,b) \to M$.


• A curve $c : (a,b) \to M$ is inextendible if it is both future and past inextendible.


Briefly: $c : I \to M$ is inextendible if I cannot be increased (whilst keeping c continuous).


Although so far all curves are smooth by assumption, this definition obviously applies to curves
that are merely continuous, and indeed is much more natural for that class. Since a continuous
curve $c : [a,b] \to M$ is always continuously extendible to $c : [a, b+\varepsilon) \to M$, for some $\varepsilon > 0$, only the
case [a,b) is interesting for future (in)extendibility. Then $c : [a,b) \to M$ is future extendible iff it
has a continuous extension $c : [a,b] \to M$.²¹² Likewise for past (in)extendibility at a.
Intuitively, the (future) limit $\lim_{t \to b} c(t)$ may not exist for three different reasons:


1. The curve moves off to infinity. For example, for $b = \infty$ define $c : \mathbb{R} \to \mathbb{R}$ by c(t) = t. But
this may also happen in finite time: take e.g. c(t) = 1/t in $M = \mathbb{R}$, with I = (a,0).


2. The would-be limit point does not exist in M. For example, take the curve


$c : [0,1) \to \mathbb{R} \setminus \{1\}$; $c(t) = t$. (5.150)


3. The image of c lies in a compact set, where c continues to wander around all the different
limit points of its convergent subsequences, never settling. A typical example is the curve


$c : (0,1) \to \mathbb{R}^2$; $c(t) = (t, \sin(1/(1-t)))$, (5.151)
which is contained in the compact set [0,1] × [−1,1] (but has infinite arc length).
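
The third mechanism can be made concrete for the curve (5.151). The sketch below, with helper names chosen here for illustration, picks two sequences of parameter values tending to 1 along which c converges to two different points, (1, 1) and (1, −1), so that the limit of c(t) as t → 1 cannot exist although the image stays in a compact set.

```python
import math

def c(t):
    """The curve (5.151): t -> (t, sin(1/(1 - t))) on (0, 1)."""
    return (t, math.sin(1.0 / (1.0 - t)))

def t_with_phase(phase, n):
    """Parameter t in (0, 1) at which 1/(1 - t) equals phase + 2*pi*n."""
    return 1.0 - 1.0 / (phase + 2.0 * math.pi * n)

# Along these parameter values the second coordinate is +1 ...
top = [c(t_with_phase(math.pi / 2.0, n)) for n in range(1, 6)]
# ... and along these it is -1, while the first coordinate tends to 1 in both cases.
bottom = [c(t_with_phase(3.0 * math.pi / 2.0, n)) for n in range(1, 6)]
```

Both subsequences stay inside [0,1] × [−1,1], but they approach different limit points.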


A geodesic $\gamma : (a,b) \to M$, where a < 0 < b, is a solution to the geodesic equation (3.24)
with given initial values $\gamma(0)$ and $\dot\gamma(0)$. It is called future complete if $b = \infty$. This is not a
priori a pointwise property, as in Definition 5.17, but nonetheless it can be shown that solutions
$\gamma$ to (3.24) whose domain is maximal are precisely geodesics that are inextendible as curves.²¹³


Definition 5.18 A geodesic $\gamma : (a,b) \to M$ is future incomplete iff it is future inextendible and
$b < \infty$. Similarly, $\gamma$ is past incomplete iff it is past inextendible and $-\infty < a$. Finally, $\gamma$ is
incomplete if it is either future or past incomplete, or both.


211 Equivalently, an endpoint of $c : [a,b) \to M$ is a point $z \in M$ such that for any nbhd U of z there is $s \in [a,b)$ such
that $c(t) \in U$ for all t > s. Then c is future (in)extendible iff it has an (has no) endpoint. This criterion is especially
useful if $b = \infty$. For this reason a (future/past) inextendible curve is sometimes called (future/past) endless.

212 If $b = \infty$, in order to define continuity of c we say that $U \subset [a,\infty]$ is open iff it is the complement in $[a,\infty]$ of a
compact set in $[a,\infty)$. In other words, topologically $[a,\infty]$ is the one-point compactification of $[a,\infty)$.

213 By Proposition 2.5.6 and Theorem 2.5.7 in Chrusciel (2020), any fd causal/timelike geodesic has an inextendible
causal/timelike extension, which is maximal as a solution to the geodesic ODE with given initial values.
Proposition 5.19 A timelike geodesic $\gamma : (a,b) \to M$ with $-\infty < a$ is future incomplete iff it is
inextendible and has finite arc length, and similarly for past incompleteness, provided $b < \infty$.


Proof. Timelike geodesics are parametrized by arc length (up to affine reparametrizations).

In causal theory one often needs approximations that require continuous causal curves. This
combination of words sounds problematic, because causal properties of curves c, which so far
were (piecewise) smooth by convention, were defined through their tangent vectors $\dot c(t)$, which
require differentiability of c(t). Nonetheless, the following definition makes good sense.²¹⁴


Definition 5.20 Let $I \subset \mathbb{R}$ be an interval, which may be (semi) closed or open, and possibly
(semi) infinite. A continuous curve $c : I \to M$ is fd causal if every point x = c(t) on the curve
($t \in I$) has a normal nbhd $U_x$ (cf. §5.2) such that the unique geodesic connecting x with any later
point $y \in U_x$ (with y = c(t') for t' > t) is fd causal. Similarly for pd causal curves.


All relevant results so far, like Proposition 5.13, are true for continuous causal curves. To analyse
such curves we introduce an auxiliary complete Riemannian metric h on M (which always
exists),²¹⁵ with associated topological metric $d_h$ defined in the usual way, cf. (3.30). Then:


Proposition 5.21 A continuous curve $c : I \to M$ is fd causal iff (possibly after reparametrization)
it is absolutely continuous and a.e. differentiable on I with $\dot c$ fd causal, and, for all $[s,u] \subset I$,


$L_h(c_{|[s,u]}) = \int_s^u dt\, \sqrt{h(\dot c(t), \dot c(t))} < \infty$. (5.152)


Proof (sketch).²¹⁶ We only sketch the inference from left to right. Take x = c(s) and y = c(u)
close enough that they both lie in a convex set U with coordinates $(x^\mu)$ in which the metric
is $ds^2 = -g_{00}\, dt^2 + g_{ij}\, dx^i dx^j$ and the interpolating geodesic has $\gamma^0(t) = t$. Since $\gamma$ is causal,
we have $g_{ij} \dot\gamma^i \dot\gamma^j \le g_{00}$, and since $(g_{ij})$ is positive definite and U has compact closure, we have
$g_{ij} x^i x^j \ge C \sum_i (x^i)^2$ for some C > 0. In the Euclidean distance $d(x,y)^2 = \sum_\mu |x^\mu - y^\mu|^2$ on U,


d(e(s),c(w))? = ar), rw)? =E -POPE S aor 
H pe 


SEJ aos (1482) (u-s}. (5.153) 


Thus c is locally Lipschitz and the claims about ¢ follow from Rademacher’s theorem. 


Since the function $u \mapsto L_h(c_{|[s,u]})$ is strictly increasing and hence invertible, any continuous
causal curve c may be parametrized by h-arc length, i.e., via one of the equivalent conditions


$h(\dot c(t), \dot c(t)) = 1$; $L_h(c_{|[s,u]}) = u - s$. (5.154)


By the same token, the length functional (5.112) can be defined and is finite.


214 We follow Minguzzi (2019). For a slightly different (locally Lipschitz) approach see Chruściel (2011, 2020).

215 Recalling from §2.1 that our manifolds M are paracompact, a partition of unity argument gives the existence of
some Riemannian metric h on M (Jost, 2002, Theorem 1.4.1). If M is compact, then h is complete, cf. Theorem
3.4. So assume M is noncompact and h is incomplete. We follow Nomizu & Ozeki (1961). For $x \in M$, define
$r(x) = \sup\{r > 0 \mid B_r(x) \text{ is compact}\}$, where $B_r(x) = \{y \in M \mid d_h(x,y) \le r\}$. If $r(x) = \infty$ for some x, then M is
compact, so $r < \infty$. Take any smooth function $\omega : M \to \mathbb{R}$ such that $\omega(x) > 1/r(x)$. Then $\omega^2 h$ is complete. In
particular, any incomplete Riemannian metric can be conformally rescaled so as to become complete.

216 For a complete proof see Theorem A.1 in Candela et al. (2010). See also Chruściel (2011), Theorem 2.3.2.
For later use, we now present a number of technical results, in which (M, g) is just a space-time.


Lemma 5.22 Let a fd continuous causal curve $c : (a,b) \to M$ be parametrized proportional to
h-arc length. Then $b = \infty$ iff c is future inextendible, and $a = -\infty$ iff c is past inextendible.²¹⁷


Approximations of curves can most naively be done pointwise, as in Lemma 5.8, but this
loses even continuity. Instead, we use the auxiliary metric h to define uniform convergence.²¹⁸
Despite the square brackets, we agree that in intervals [a,b] both $a = -\infty$ and $b = \infty$ are allowed.


Definition 5.23 For curves $c_n : [a_n, b_n] \to M$ and $c : [a,b] \to M$, uniform convergence $c_n \to c$
(on compacta) means that for every compact interval $[a',b'] \subset [a,b]$ there is a sequence of
compact intervals $[a'_n, b'_n] \subset [a_n, b_n]$ such that the following three conditions hold, as $n \to \infty$:


$a'_n \to a'$; $b'_n \to b'$; $\sup_{t \in [a'_n, b'_n] \cap [a',b']} d_h(c_n(t), c(t)) \to 0$. (5.155)

This turns out to be independent of the choice of h. Uniform convergence preserves continuity of 
curves, and in addition it preserves causality and (forward or backward) directedness.²¹⁹ 

A very important result, to be used e.g. in the proof of Theorem 5.30, is upper semicontinuity 
of the Lorentzian length functional. For (fd) causal curves c, L(c) was defined by (5.114), i.e. 


$L(c) = \int_a^b dt\, \sqrt{-g(\dot{c}(t),\dot{c}(t))}.$ (5.156) 


By Proposition 5.21, this expression is even defined for continuous (fd) causal curves. 


Lemma 5.24 Let c be a fd continuous causal curve $c : [a,b] \to M$, parametrized proportional to 
h-arc length. Then any sequence $(c_n)$ converging uniformly to c, as in Definition 5.23, satisfies 


$\limsup_{n} L(c_n) \le L(c).$ (5.157) 


The idea behind this lemma comes from the following figure,²²⁰ which shows that one may 
decrease the length of a causal curve c at will by adding a chain of almost lightlike directions. 


217 See Minguzzi (2008), Lemma 2.6. This is even true if c(a,b) is precompact, in which case c cannot have 
an endpoint if it is future inextendible and hence wanders around its limit points, indefinitely increasing $L_h(c)$. 
There is a potential ambiguity if we apply this to geodesics. These are affinely parametrized, whereas causal curves 
are parametrized by h-arc length. But by Proposition 2.5.6 in Chruściel (2020) a geodesic is inextendible iff it is 
inextendible as a causal curve (in his locally Lipschitz sense, but see footnote 216). See also footnote 213. 

218 Our discussion is based on Minguzzi (2008a, 2019). It is also possible to do such approximations without an 
auxiliary metric, see Hawking & Ellis (1973), Lemma 6.2.1, and O’Neill (1983), chapter 14, but this is contrived. 

219 See Lemma 2.7 in Minguzzi (2008). However, the limit curve c need not be parametrized by h-arc length. 

220 Figure redrawn from Penrose (1972), page 50, Fig. 43, by Edith de Jong. 
On the other hand, increasing its length can only be done by adding timelike pieces, which becomes 
ever more difficult as one approaches c and hence will not lead to large length differences. 
Nonetheless, Lemma 5.24 is difficult to prove and we will just talk the reader through it.²²¹ 
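
The zigzag mechanism behind Lemma 5.24 can be checked numerically in the simplest setting. The following is only a sketch under assumptions that are not in the text: 2d Minkowski space-time with signature (-,+), the Euclidean metric as auxiliary metric h, curves compared at equal coordinate time t rather than at equal h-arc length, and hypothetical helper names (`lorentzian_length`, `zigzag`, `sup_distance_to_axis`). It builds zigzags with slopes approaching the light cone that converge uniformly to the timelike segment c(t) = (t, 0), while their Lorentzian lengths tend to 0 < L(c) = 1.

```python
import math

def lorentzian_length(points):
    """Length of a piecewise-linear fd causal curve in 2d Minkowski space.

    points: list of (t, x) vertices; with signature (-, +) each straight
    segment contributes sqrt(dt^2 - dx^2).
    """
    total = 0.0
    for (t0, x0), (t1, x1) in zip(points, points[1:]):
        dt, dx = t1 - t0, x1 - x0
        interval = dt * dt - dx * dx
        if dt <= 0 or interval < -1e-12:
            raise ValueError("segment is not future-directed causal")
        total += math.sqrt(max(interval, 0.0))
    return total

def zigzag(n, v):
    """Zigzag from (0, 0) to (1, 0): 2n segments with slope dx/dt = +v, -v, ..."""
    dt = 1.0 / (2 * n)
    points = [(0.0, 0.0)]
    for k in range(2 * n):
        t, x = points[-1]
        sign = 1.0 if k % 2 == 0 else -1.0
        points.append((t + dt, x + sign * v * dt))
    return points

def sup_distance_to_axis(points):
    """Largest |x| over the vertices, i.e. sup-distance to c(t) = (t, 0) at equal t."""
    return max(abs(x) for _, x in points)

straight = [(0.0, 0.0), (1.0, 0.0)]  # the limit curve c, with L(c) = 1
for n in (1, 10, 100, 1000):
    v = 1.0 - 1.0 / (n + 1)          # slope approaching the light cone
    c_n = zigzag(n, v)
    print(n, sup_distance_to_axis(c_n), lorentzian_length(c_n))
print("L(c) =", lorentzian_length(straight))
```

In this toy model each zigzag has length sqrt(1 - v^2) regardless of n, so the lengths can drop far below L(c) in the limit, whereas no uniformly close causal curve can be appreciably longer than c.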


Lemma 5.25 The length of any continuous causal curve c can be approximated as 
$L(c) = \inf_{\gamma} L(\gamma),$ (5.158) 


where the infimum is over all interpolations of c by piecewise smooth causal geodesics $\gamma$.²²² 


Intuitively, by Proposition 5.9 timelike segments of $\gamma$ can only increase the length of the piece of 
c they interpolate, whereas lightlike segments cannot decrease it. This explains the infimum in 
(5.158). We may unfold the meaning of eq. (5.158) in Lemma 5.25: it states that 


1. $L(c) \le L(\gamma)$ for all piecewise smooth causal geodesics $\gamma$; 
2. For any $\varepsilon > 0$ there is a $\gamma$ such that $L(\gamma) < L(c) + \varepsilon/2$. 


Applying the first point to $c_n$ close enough to c shows that for every $\varepsilon > 0$ there is $N \in \mathbb{N}$ such 
that for all n > N one has $L(c_n) \le L(\gamma) + \varepsilon/2$. Hence 


$L(c_n) \le L(\gamma) + \varepsilon/2 < L(c) + \varepsilon/2 + \varepsilon/2 = L(c) + \varepsilon.$ (5.159) 


This proves Lemma 5.24, which is one of the keys to Theorem 5.30 below on the existence of 
length-maximizing geodesics (which in turn leads to Hawking’s singularity theorem 6.4). 

Finally, we will need the following version of the limit curve lemma, for which we ask the 
reader to first read Definition 5.27 of global hyperbolicity on the next page.²²³ 


Lemma 5.26 If $(c_n : [0,b_n] \to M)$ is a sequence of fd continuous causal curves parametrized by 
h-arc length in a globally hyperbolic space-time such that $c_n(0) \to x$ and $c_n(b_n) \to y \neq x$, there 
exists a fd continuous causal curve $c : [0,b] \to M$, where $b < \infty$, as well as a subsequence of $(c_n)$ 
that converges uniformly to c (cf. Definition 5.23), including $b_n \to b$ at the endpoint. 


This lemma ultimately derives from the Arzelà-Ascoli theorem. The role of global hyperbolicity 
is just to exclude the possibility that $b_n$ wanders off to infinity, in which case one would have 
an inextendible fd causal limit curve $c : [0,\infty) \to M$ that sort of circles around y without ever 
reaching it,²²⁴ cf. Lemma 5.22. Removing the assumption of global hyperbolicity allows this. 


221 Complete proofs may be found in e.g. Penrose (1972), Theorem 7.5, and Hawking & Ellis (1973), Lemma 
6.7.2 (both in the setting of Theorem 5.34.1 below). We follow Minguzzi (2019), Theorems 2.37 and 2.41. 

222 This means that the smooth segments of $\gamma$ are causal geodesics that have endpoints on c and are contained in a 
convex nbhd that also contains the segment of c they connect. Just think of picking sufficiently many points on c 
such that each adjacent pair lies in a common convex nbhd, connect this pair by a unique geodesic, and continue. 

223 Lemma 5.26 is a special case of case (i) of Theorem 2.53 in Minguzzi (2019), whose case (ii), where $b = \infty$, is 
excluded by our assumption of global hyperbolicity. The need for varying $b_n$ is due to the chosen parametrization by 
h-arc length. In his framework of locally Lipschitz causal curves, Chruściel (2020), Proposition 2.6.2, has a simpler 
version of Lemma 5.26, according to which any sequence $(c_n : [0,\infty) \to M)$ for which $c_n(0) \to x$ and for which there 
is a constant L > 0 such that $L^{-1}|t-t'| \le d_h(c_n(t),c_n(t')) \le L|t-t'|$ for all n and all $t,t' \in [0,b]$ (assuming $b < \infty$), 
has a subsequence converging to some limit curve in the sense of Definition 5.23. 

224 Global hyperbolicity excludes compact sets $K \subset M$ that contain fd continuous causal curves $c : (0,\infty) \to K$. 
5.7 Global hyperbolicity 


In our context of analyzing geodesics, global hyperbolicity arises as an assumption in Theorem 
5.30, which as such propagates into Theorem 6.4. It is also a key assumption in Penrose’s 
singularity theorem. Indeed, global hyperbolicity is central to almost all of mathematical GR.²²⁵ 


Definition 5.27 A space-time (M, g) is called: 


1. non-imprisoning if no future inextendible fd causal curve c is contained in a compact 
subset of M (i.e., if every such curve c “eventually wanders off to infinity”).²²⁶ 


2. globally hyperbolic if it is non-imprisoning and each double cone (or causal diamond) 
$J(x,y) := J^+(x) \cap J^-(y)$ (5.160) 
is compact. N.B. by Proposition 5.4.1, $J(x,y) \neq \emptyset$ requires $x \le y$, i.e. $y \in J^+(x)$. 


We give various alternative characterizations of global hyperbolicity in the next section. But first, 
to elucidate the role of non-imprisonment we compare it to three related assumptions:²²⁷ 


Definition 5.28 A space-time (M,g) is called: 


1. causal if it contains no closed causal curves.²²⁸ 

2. strongly causal if any nbhd $U_x$ of any $x \in M$ contains an open nbhd $V_x$ such that any causal 
curve with endpoints in $V_x$ entirely lies in $V_x$ (as opposed to: leaving it and returning). 
Equivalently, if $c : I \to M$, the set $\{t \in I \mid c(t) \in V_x\}$ is connected. 


3. non-partially imprisoning if there are no inextendible causal curves c that continue to 
return to some compact set, although they may also continue to leave it (technically: there 
exists no compact set $K \subset M$ for which the parameter set $c^{-1}(K) \subset \mathbb{R}$ is non-compact).²²⁹ 


The meaning of causality should be obvious; its violation is associated with all kinds of “(murdering 
one’s) grandfather” paradoxes. Strong causality is a form of causality stabilized against 
perturbations of points: there aren’t even any causal curves that start at x and end at points y 
arbitrarily close to x, except the very short direct causal curves from x to y (if $y \in J^{\pm}(x)$). 
Partial imprisonment is a weakening of imprisonment, and the logical implications are:²³⁰ 


strongly causal ⇒ non-partially-imprisoning ⇒ non-imprisoning ⇒ causal. 


225 Our definition of global hyperbolicity (which simplifies some arguments in §5.8) is equivalent to the usual one 
in which non-imprisonment is replaced by strong causality. See Minguzzi (2019), Proof after Definition 4.117. 

226 This is equivalent to the same condition with future/fd changed into past/pd, see Minguzzi (2019), page 119. 

227 See Minguzzi (2008b, 2019). This is part of the causal ladder (Minguzzi & Sánchez, 2008). Imprisonment 
(under the name of total imprisonment) and partial imprisonment were introduced by Carter (1971a). 

228 One also says that (M, g) is chronological if there are no closed timelike curves. 

229 For an imprisoned curve c, the set $c^{-1}(K)$ is by definition the entire parameter space I. 

230 The first implication is Proposition 4.80 in Minguzzi (2019), the second is trivial from the definitions, and the 
third is Proposition 4.37 in Minguzzi (2019). The implication strongly causal ⇒ non-imprisoning is also obvious 
in the contrapositive; for any inextendible fd causal curve $c : [a,b) \to K$ contained in a compact set K has a limit 
point x as $t \to b$, but it has no endpoint, and so any limit point provides a counterexample to the definition of strong 
causality. See Hawking & Ellis (1973), p. 195 or Minguzzi (2019, §4.3.1) for a (contrived) example of a (totally) 
imprisoning space-time; but all examples must be pretty pathological. 
Compactness of J(x,y) holds in Minkowski space-time. In curved space-times it allows 
“interesting” singularities but blocks “trivial” ones, as in the case where J(x,y) fails to be compact 
by missing a point in its interior. In that case, there is a fd causal curve from x that disappears 
into this point. This curve lies in the past of y and hence is visible for an observer at y, making 
the hole a (locally) naked singularity. Global hyperbolicity prevents this possibility. See §10.4. 
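
The "missing point" scenario can be made concrete in 2d Minkowski space-time. The sketch below rests on assumptions not taken from the text: points are written as (t, x), the helper names `precedes` and `in_diamond` are hypothetical, and the removed point is chosen as the origin. It tests membership in J(x,y) and exhibits a sequence on a fd timelike curve from x that stays inside J(x,y) and heads into the hole; once the hole is removed from the space-time, this sequence has no limit in J(x,y), which is therefore not compact.

```python
def precedes(a, b):
    """True if b lies in J^+(a) in 2d Minkowski space; points are (t, x)."""
    return b[0] - a[0] >= abs(b[1] - a[1])

def in_diamond(p, x, y):
    """True if p lies in the double cone J(x, y) = J^+(x) ∩ J^-(y)."""
    return precedes(x, p) and precedes(p, y)

x, y = (-1.0, 0.0), (1.0, 0.0)
hole = (0.0, 0.0)  # hypothetical removed point in the interior of J(x, y)

# Points on the timelike segment from x towards the hole.
approach = [(-1.0 / k, 0.0) for k in range(2, 50)]

print(all(in_diamond(p, x, y) for p in approach))  # the sequence stays in J(x, y)
print(in_diamond(hole, x, y))                      # its limit lies in J(x, y) only if the hole is kept
```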


Technically, the following equivalences will be useful (proved by elementary topology):²³¹ 


Lemma 5.29 Let (M,g) be a space-time. The following properties are equivalent: 
1. Each double cone $J^+(x) \cap J^-(y)$ is closed. 
2. All sets $J^{\pm}(x)$ are closed. 
3. All sets $J^{\pm}(K)$, where $K \subset M$ is compact, are closed. 
Also, (M,g) is globally hyperbolic iff $J^+(K) \cap J^-(L)$ is compact for any compact $K, L \subset M$. 


The following key result is independently due to Avez and Hawking.²³² To see the need for 
global hyperbolicity in this, note that in Minkowski space-time with the origin removed, no point 
$(x^0,0)$ with $x^0 < 0$ can be connected to $(x^0,0)$ with $x^0 > 0$ by a geodesic at all. 


Theorem 5.30 If (M,g) is globally hyperbolic, then any $x \in M$ and $y \in J^+(x)$ can be connected 
by a fd causal geodesic of finite length, which length is maximal among all fd causal curves from 
x to y. If $y \in I^+(x)$ then this maximizing geodesic is timelike, and if $y \in E^+(x)$ it is lightlike. 


Proof. All curves are continuous fd causal. Recall (5.117) and (5.118). Since $L(\gamma)$ and hence 
$d_L(x,y)$ are parameter-independent, we may assume all our curves to be parametrized by h-arc 
length. We also assume that all curves start at t = 0 with c(0) = x. All curves c from x to y lie in 
J(x,y), which is compact by assumption. In J(x,y) we may choose our auxiliary Riemannian 
metric h such that $\|X\|_g \le \|X\|_h$ for all causal vectors X. Lemma 5.31 below then gives: 


dı(x,y) < dr(x,y) < %. (5.161) 


Now take a sequence $(c_n)$ of continuous fd causal curves from x to y for which $\sup_n L(c_n) = d_L(x,y)$, and hence also 
$\limsup_n L(c_n) = d_L(x,y)$. By Lemma 5.26 this sequence has a limit curve $c : [0,b] \to M$ with 
c(b) = y. Lemma 5.24 gives 


$L(c) \le d_L(x,y) = \sup_n L(c_n) = \limsup_n L(c_n) \le L(c),$ (5.162) 
so that $L(c) = d_L(x,y)$. Hence c achieves the supremum in (5.118) and has maximal length. 


Proposition 5.13 then makes c a (smooth) causal pregeodesic with the claimed properties,²³⁴ 
predicated on $y \in I^+(x)$ or $y \in E^+(x)$. Finally, reparametrization turns it into a geodesic. 


Lemma 5.31 If (M, g) is non-imprisoning, then for any compact subset $K \subset M$ there is a constant 
$0 < C_K < \infty$ such that for any continuous (fd) causal curve $c : [0,b] \to K$ or $c : [0,b) \to K$, 
parametrized by h-arc length, one has the uniform bound $L_h(c) \le C_K$. In particular, $b < \infty$. 


231 See Minguzzi (2019), Theorem 4.12, for part 1, and Galloway (2014), Proposition 4.3, for the last claim. 

232 See Avez (1963) and Hawking (1966/2014), his Adams Prize Essay, expanded into Hawking & Ellis (1973). 

233 See Minguzzi (2019), Theorem 2.55. 

234 This also follows from Theorem 2.20 in Minguzzi (2019), which states that each maximizing causal curve is a 
causal geodesic. As in Proposition 5.13, the proof is done by localization to convex nbhds. 
5.8 Cauchy surfaces and Cauchy horizons 


In this section we give various equivalent characterizations of global hyperbolicity.²³⁵ These 
look quite different from each other, which is very useful in both causal and PDE theory and also 
illustrates the richness of the concept. First, for $x \le y$ (i.e. $x \in M$ and $y \in J^+(x)$), define C(x,y) 
as the space of continuous fd causal curves c from x to y that are defined on I = [0,1] and are 
parametrized proportional to h-arc length, where h is a complete Riemannian metric, as in the 
previous section: hence $h(\dot{c}(t),\dot{c}(t))$ is constant a.e. in t, though not necessarily 1. Then 


$d(c_1,c_2) = \sup_{t \in [0,1]} d_h(c_1(t),c_2(t)) + |L_h(c_1) - L_h(c_2)|$ (5.163) 


turns C(x,y) into a metric space; the first term makes the evaluation map 
$\mathrm{ev} : C(x,y) \times [0,1] \to M; \qquad \mathrm{ev}(c,t) = c(t)$ (5.164) 


continuous, upon which the second term also makes the Riemannian curve length functional 


$L_h(c) = \int_0^1 dt\, \sqrt{h(\dot{c}(t),\dot{c}(t))}$ (5.165) 


continuous.²³⁶ Leray’s original definition of global hyperbolicity was essentially that each 
space C(x,y) be precompact in the compact-open topology borrowed from C([0,1],M).²³⁷ 
Surprisingly, this is equivalent to the existence of a Cauchy surface, which we define now. 


235 See Leray (1953), and Choquet-Bruhat (2014) for some history. See also Choquet-Bruhat (2009), chapter XII. 
Our approach combines elements of Choquet-Bruhat, loc. cit., with Theorem 4.1 in Sanchez (2007). 

236 This makes convergence in the metric d stricter than uniform convergence in $d_h$ (as in Definition 5.23, in which 
$L_h$ is generally merely lower semicontinuous, analogously to upper semicontinuity of the Lorentzian length L). 
The metric (5.163) was introduced by Bott & Mather (1968, p. 474) and is also used by Choquet-Bruhat (2009), 
§XII.8.2 (who works with the class of rectifiable continuous causal curves). Since I = [0,1] is compact, the first 
term in (5.163) gives the compact-open topology, see Clarke (1993), §6.2.2. One may wonder why things are so 
complicated. The reason is that if one allows arbitrary parametrizations of curves all hope of compactness of C(x,y) 
or its closure is gone. The approach chosen in the main text (following Bott, Choquet-Bruhat and Sánchez) is 
one way around this problem by introducing preferred parametrizations, at the cost though of the unusual metric 
(5.163). Alternatively, Penrose (1972) and Hawking & Ellis (1973) work with the space $\hat{C}(x,y)$ of continuous fd 
causal curves up to reparametrization, i.e. one uses the image c([0,1]) in M rather than the function $c : [0,1] \to M$. 
This image space is topologized by letting any open nbhd of c (more precisely, its image in M) consist of all fd 
causal curves $\gamma$ whose image lies in some open nbhd of c([0,1]) in M. This topology is very natural and coincides 
with the quotient of the compact-open topology on C([0,1],M) to the image space, see again Clarke (1993), §6.2.2. 
However, unlike the approach in the main text, this procedure hardly makes sense when (M, g) is not causal, since 
in that case loops traversed any number of times are identified (since they have the same image in M), although 
they are clearly different things. Thus one assumes causality from the outset, indeed even strong causality, in 
which case $\hat{C}(x,y)$ need not be completed and global hyperbolicity is characterized by compactness of $\hat{C}(x,y)$. See 
Penrose (1972), §6 or Hawking & Ellis (1973), Proposition 6.6.2. Furthermore, Lemma 5.24 remains valid, mutatis 
mutandis, in that L is upper semicontinuous, i.e. for each $c \in \hat{C}(x,y)$ and each $\varepsilon > 0$ there is a nbhd T of c such that 
$L(\gamma) \le L(c) + \varepsilon$ for all $\gamma \in T$. See Penrose (1972), Theorem 7.5 or Hawking & Ellis (1973), Lemma 6.7.2. 

237 The topology induced by the metric (5.163) is not defined on all of C([0,1],M) because of the second term; the 
first term would recover the compact-open topology. Restricted to C(x,y), the metric topology given by (5.163) 
is of course finer than the compact-open topology, so that $\overline{C(x,y)}$ is also complete in the metric (5.163), and 
indeed $\overline{C(x,y)}$ is a good model of the abstract (Cauchy) completion of C(x,y) defined for any metric space. This 
is necessary because despite Lemma 5.26, the space C(x,y) is not itself complete in the metric (5.163), since, 
as already mentioned in footnote 219, uniform limits of sequences of continuous causal curves parametrized 
proportional to h-arc length need not be parametrized in that way and hence may disappear from C(x,y). 
Definition 5.32 A Cauchy (hyper)surface in a space-time (M,g) is a subset $\Sigma \subset M$ with the 
property that each inextendible timelike curve intersects $\Sigma$ in exactly one point. 


An easy example is the $x^0 = 0$ hypersurface in Minkowski space, which may be tilted or curved, 
as long as all tangent vectors remain spacelike. But neither the hyperboloid $H$ defined in (4.88) 
nor even the forward lightcone $\partial I^+(x)$ is a Cauchy surface. Here are some first results: 


Theorem 5.33 Let (M,g) be a space-time with Cauchy surface $\Sigma \subset M$. Then: 
1. $\Sigma$ is a closed connected achronal 3d topological submanifold of M. 
2. Any other possible Cauchy surface in M is homeomorphic to $\Sigma$. 
3. M is homeomorphic to $\mathbb{R} \times \Sigma$. 


We will give more precise results in Theorem 5.44, especially concerning the possible smoothness 
of all constructions and the existence of spacelike Cauchy surfaces, but for a first acquaintance 
with Cauchy surfaces the above facts are enough (and indeed historically they were the first to be 
established).²³⁸ The first claim is technical,²³⁹ except for achronality which is trivial, but the 
second and third are fairly intuitive. If $\Sigma$ and $\Sigma'$ are both Cauchy surfaces, then any inextendible 
timelike curve meets each of them once. In particular, the integral curves of a complete timelike 
vector field T on M, such as the one defining its time-orientation,²⁴⁰ give an identification of 
$\Sigma$ and $\Sigma'$. Similarly, since the integral curves c of T are topologically $\mathbb{R}$ (see Lemma 5.22, 
extended also in the backward direction), we obtain a map 


$\mathbb{R} \times \Sigma \to M; \qquad (t, \sigma = c(0)) \mapsto c(t),$ (5.166) 


that is, $\Sigma$ is moved (forward or backward in time) with the flow of T. This map is a bijection 
by definition of a Cauchy surface, and can be shown to be a homeomorphism, like the above 
bijection $\Sigma \cong \Sigma'$. Such arguments can be made rigorous once we have time functions (see §5.9). 


Theorem 5.34 Each of the following conditions is equivalent to global hyperbolicity of (M,g): 
1. The space C(x,y) is precompact for all $x \le y$ (i.e. its closure $\overline{C(x,y)}$ is compact). 


2. For each $x \le y$ there is a constant $K_{xy} < \infty$ such that for all $c \in C(x,y)$ one has 


$L_h(c) \le K_{xy}.$ (5.167) 


3. M has a Cauchy surface. 


A detailed proof of this theorem takes many pages and is hardly instructive, except for explaining 
how the various assumptions are related to each other. Short of giving a complete proof, our goal 
is therefore merely to sketch these relations, and refer those who want more to the literature.²⁴¹ 


238 The theory of Cauchy surfaces in GR was initiated by Geroch (1970) in the topological setting; see also 
Hawking & Ellis (1973), chapter 6 and O’Neill (1983), chapter 14. This theory was extended to the smooth case by 
Bernal and Sánchez (2003, 2005, 2006a); see also the reviews Sánchez (2005, 2007). See the end of §5.9. 

239 See e.g. O’Neill (1983), Lemma 14.29 to Corollary 14.32. Claim 3 requires a version of the limit curve lemma. 

240 If the given T is not complete, then $T/\|T\|_h$ is complete, where h is a complete Riemannian metric as usual. 

241 See for example Geroch (1970), Penrose (1972), chapters 6 and 7, Hawking & Ellis (1973), chapter 6, O’Neill 
For the implication 1 ⇒ 2 we note that $L_h$ is continuous and hence Weierstrass’s theorem 
guarantees that $L_h$ assumes a maximum value M on the compact set $\overline{C(x,y)}$. Any $K_{xy} \ge M$ then 
satisfies (5.167). Conversely, by the Arzelà-Ascoli theorem $\overline{C(x,y)}$ is compact iff: 


1. Each set $\{c(t) \mid c \in C(x,y)\} \subset M$, where $t \in [0,1]$, is bounded; 


2. The family of curves C(x,y) is equicontinuous, i.e., for each $t \in [0,1]$ and each $\varepsilon > 0$ there 
is $\delta > 0$ such that if $|s - t| < \delta$, then $d_h(c(s),c(t)) < \varepsilon$ for all $c \in C(x,y)$. 


Eq. (5.167) implies both conditions: indeed, if this is the case, then the inequalities 
$d_h(x,c(t)) \le L_h(c) \le K_{xy}$ (5.168) 


make the set $\{c(t) \mid c \in C(x,y)\}$ in clause 1 of the Arzelà-Ascoli theorem bounded. Since 
c is parametrized proportional to h-arc length, we have $L_h(c_{|[s,t]}) = L_h(c)|s-t|$, and hence 


$d_h(c(s),c(t)) \le L_h(c)|s-t| \le K_{xy}|s-t|,$ (5.169) 


which proves equicontinuity. Hence $\overline{C(x,y)}$ is compact and we are ready with 1 ⇔ 2. 
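
The equicontinuity step can be written out as a short worked estimate. This is a sketch under the assumptions stated in the text (c parametrized proportional to h-arc length on [0,1], and the bound (5.167)); the explicit choice of $\delta$ and the additional assumption $K_{xy} > 0$ are ours.

```latex
% Assumes c \in C(x,y) is parametrized proportional to h-arc length on [0,1],
% that L_h(c) \le K_{xy} as in (5.167), and (our assumption) that K_{xy} > 0.
\begin{align*}
  d_h\bigl(c(s),c(t)\bigr)
    &\le L_h\bigl(c_{|[s,t]}\bigr)            && \text{($d_h$ is the infimum of $h$-lengths)} \\
    &=   L_h(c)\,|s-t|                         && \text{(constant $h$-speed $L_h(c)$ on $[0,1]$)} \\
    &\le K_{xy}\,|s-t|                         && \text{(by (5.167))}.
\end{align*}
% Hence, given \varepsilon > 0, the choice \delta = \varepsilon / K_{xy} gives
% d_h(c(s),c(t)) < \varepsilon whenever |s-t| < \delta, uniformly in c \in C(x,y).
```

The point of the estimate is that $\delta$ depends only on $\varepsilon$ and $K_{xy}$, not on the individual curve, which is exactly what equicontinuity of the family C(x,y) requires.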

To prove that 1 or 2 is equivalent to global hyperbolicity as in Definition 5.27, first note 
that if $\overline{C(x,y)}$ is compact, then so is J(x,y). This follows from the continuity of the evaluation 
map. The inequality (5.167) forces non-imprisonment by contradiction: if $K \subset M$ is compact 
and contains an inextendible fd continuous causal curve c, then this curve is also contained in a 
double cone J(x,y), just proven compact, to which Lemma 5.22 and (5.167) apply.²⁴² 

Conversely, the implication from Definition 5.27 to (5.167) immediately follows by taking 
K = J(x,y) in Lemma 5.31. Thus Definition 5.27 and properties 1 and 2 in Theorem 5.34 are 
closely related and easily transferable into each other; the only technical tool was Arzelà-Ascoli. 

Property 3 is quite different, and the proof of equivalence uses a whole new arsenal of 
techniques, each of which is also of independent interest and has many other applications in GR. 

First, the analysis of Cauchy surfaces in property 3 involves the following concept:²⁴³ 


Definition 5.35 Let $S \subset M$ be an achronal subset of M. 


1. The domain of dependence or future Cauchy development $D^+(S)$ of S is the set of all 
$x \in M$ for which every past-inextendible pd causal curve starting from x intersects S. 


2. The domain of influence or past Cauchy development $D^-(S)$ of S 
is the set of all $x \in M$ for which every future-inextendible fd causal curve starting from x 
intersects S. 


3. The total domain of dependence or two-sided Cauchy development of S is 


$D(S) := D^+(S) \cup D^-(S).$ (5.170) 


(1983), chapter 14, Beem, Ehrlich, & Easley (1996), chapter 3, Kriele (1999), chapter 8, Choquet-Bruhat (2009), 
chapter XII, Chruściel (2011), and Minguzzi (2019), chapter 3. 

242 If $c : [0,\infty) \to K$ is the curve in question, then take x = c(0) and $y \in \bigcap_{t>0} \overline{c(t,\infty)}$, which is nonempty and lies 
in K by Minguzzi (2019), Proposition 2.72. The curve then lies in J(x,y). 

243 Definition 5.35 makes sense for any $S \subset M$ but is only used when S is achronal, i.e. if no two points of S can 
be connected by a timelike curve; see (5.145). Such surfaces S carry initial data for hyperbolic PDEs and the idea is 
that in relativistic physics everything happening at $x \in D^+(S)$ is determined by the state of affairs at S. This is not 
really a theorem of mathematical physics, but it is a principle that is backed by the theory of hyperbolic PDEs. See 
Courant & Hilbert (1962), Choquet-Bruhat (2009), Bär, Ginoux, and Pfäffle (2007), and Earman (1995, 2007). 
By definition, one then has the (not necessarily strict) inclusions 


$S \subseteq D^{\pm}(S) \subseteq J^{\pm}(S).$ (5.171) 


A simple case is $S = \{x \mid x^0 = 0\}$ in Minkowski space-time, which gives $D^+(S) = \{x \mid x^0 \ge 0\}$. 


Here are four more instructive examples, all due to Penrose himself.²⁴⁴ 


Examples of domains of dependence and influence, taking place in 2d Minkowski space-time $(\mathbb{M}^2, \eta)$. 


• In the figure on the left, S is a generic closed, achronal, and bounded (and hence compact) 
(hyper)surface. The domains $D^{\pm}(S)$ are compact, too, and so is, of course, their union D(S). 


• In the figure on the right, S is the closed and achronal, but unbounded “southern” hyperboloid 


$S = H := \{(t,x) \in \mathbb{R}^2 \mid t = -\sqrt{x^2+1}\},$ (5.172) 
cf. (4.88), which asymptotes towards the past lightcone $t = -|x|$. For the domains $D^{\pm}(S)$ we find 
$D^+(S) = \{(t,x) \in \mathbb{R}^2 \mid -\sqrt{x^2+1} \le t < -|x|\};$ (5.173) 
$D^-(S) = \{(t,x) \in \mathbb{R}^2 \mid t \le -\sqrt{x^2+1}\}.$ (5.174) 
Note that $D^+(S)$ is not closed, since lightlike (hence causal) curves on $t = -|x|$ do not meet S. 


• In the figure on the left, a point has been removed from S, which has a drastic effect on $D^+(S)$. 


• In the figure on the right, a point has been removed from $\mathbb{M}^2$, with a similar effect on $D^+(S)$. 
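
The description (5.173) of the future domain of dependence of the southern hyperboloid can be probed numerically. The sketch below is illustrative only and makes assumptions beyond the text: it tests just the two past-directed null rays through a point (t0, x0), which is a necessary condition for membership in the domain of dependence and which (5.173) indicates is also sufficient in this example; the function names are hypothetical. The intersection parameter comes from solving $t_0 - s = -\sqrt{(x_0 + \sigma s)^2 + 1}$ for s, where $\sigma = \pm 1$ selects the ray.

```python
import math

def past_null_ray_hit(t0, x0, sigma, tol=1e-12):
    """Parameter s >= 0 at which the past null ray (t0 - s, x0 + sigma*s)
    meets S = {t = -sqrt(x^2 + 1)}, or None if it never does."""
    denom = 2.0 * (t0 + sigma * x0)
    if denom == 0.0:
        return None                      # the ray runs along the light cone t = -|x|
    s = (t0 * t0 - x0 * x0 - 1.0) / denom
    # Squaring introduced spurious roots on the northern branch; keep t0 - s < 0.
    if s >= -tol and t0 - s < 0.0:
        return max(s, 0.0)
    return None

def both_past_null_rays_hit(t0, x0):
    """Necessary condition for (t0, x0) to lie in the future domain of dependence of S."""
    return all(past_null_ray_hit(t0, x0, sigma) is not None for sigma in (+1, -1))

def formula_5_173(t0, x0):
    """Membership according to (5.173): -sqrt(x^2 + 1) <= t < -|x|."""
    return -math.sqrt(x0 * x0 + 1.0) <= t0 < -abs(x0)

# Compare the two criteria on a small grid of (t, x) values.
grid = [(-3.0 + 0.25 * i, -2.0 + 0.25 * j) for i in range(17) for j in range(17)]
mismatches = [(t, x) for (t, x) in grid
              if both_past_null_rays_hit(t, x) != formula_5_173(t, x)]
print(len(mismatches))
```

Points on the cone $t = -|x|$ fail the test because one of their past null rays stays on the cone and never reaches S, in line with the remark that this domain is not closed.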


244 Adapted from Penrose (1972), pp. 39-40, Fig. 31-34, redrawn by Edith de Jong. Note that Penrose defines the 
domains $D^{\pm}(S)$ using timelike curves instead of causal curves. If we write these as $D_P^{\pm}(S)$, then for closed achronal 
sets S one has $D_P^{\pm}(S) = \overline{D^{\pm}(S)}$ (Minguzzi, 2019, Proposition 3.10), so that $D_P^{\pm}(S)$, unlike $D^{\pm}(S)$, is closed. 
In any dimension $d \ge 2$, the hyperboloid example (5.172), adding the “north”, becomes: 


$S = \pm H = \{x \in \mathbb{R}^d \mid x^0 = \pm\sqrt{|\vec{x}|^2+1}\};$ (5.175) 
$D^{\pm}(\pm H) = J^{\pm}(\pm H) = \{x \in \mathbb{R}^d \mid \pm x^0 \ge \sqrt{|\vec{x}|^2+1}\};$ (5.176) 
$D^{\pm}(\mp H) = J^{\pm}(\mp H) \cap I^{\mp}(0) = \{x \in \mathbb{R}^d \mid |\vec{x}| < \mp x^0 \le \sqrt{|\vec{x}|^2+1}\}.$ (5.177) 


Following Hawking, we now define the future/past Cauchy horizons $H_C^{\pm}(S)$ of S by 


$H_C^+(S) := \overline{D^+(S)} \setminus I^-(D^+(S)) = \{x \in \overline{D^+(S)} \mid I^+(x) \cap D^+(S) = \emptyset\};$ (5.178) 
$H_C^-(S) := \overline{D^-(S)} \setminus I^+(D^-(S)) = \{x \in \overline{D^-(S)} \mid I^-(x) \cap D^-(S) = \emptyset\};$ (5.179) 
$H_C(S) := H_C^+(S) \cup H_C^-(S).$ (5.180) 


That is, $H_C^+(S)$ consists of all points $x \in \overline{D^+(S)}$ that precede no other point in $D^+(S)$, etc.²⁴⁵ 
As any point beyond $H_C^+(S)$ can be influenced by events outside S (etc.), the Cauchy horizons 
$H_C^{\pm}(S)$ measure the failure of S to be a Cauchy surface, cf. Proposition 5.38 below. But first, we 
simplify eqs. (5.178) - (5.179) under further assumptions on S (beyond it being achronal). 
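
For the simplest bounded example, the segment S = [-1,1] x {0} in 2d Minkowski space-time, the future Cauchy development and its horizon are explicit enough to encode directly. The sketch below hard-codes the triangle with vertices (-1,0), (1,0), (0,1) and the effect of removing the origin from S; the coordinates are written as (x, t), and the function names and the tolerance are our own assumptions rather than anything defined in the text.

```python
def in_future_development(x, t):
    """Future Cauchy development of S = [-1, 1] x {0}: the closed triangle 0 <= t <= 1 - |x|."""
    return 0.0 <= t <= 1.0 - abs(x)

def on_future_horizon(x, t, tol=1e-12):
    """Future Cauchy horizon of S: the two upper sides t = 1 - |x| of the triangle."""
    return t >= 0.0 and abs(t - (1.0 - abs(x))) < tol

def in_future_development_punctured(x, t):
    """Same set after removing (0, 0) from S: points with t >= |x| drop out,
    since a past-directed causal curve from them can escape through the hole."""
    return in_future_development(x, t) and t < abs(x)

print(in_future_development(0.0, 1.0))              # apex of the triangle
print(on_future_horizon(0.5, 0.5))                  # a point on an upper side
print(in_future_development_punctured(0.0, 0.5))    # lies in the removed double cone
print(in_future_development_punctured(0.75, 0.2))   # unaffected by the puncture
```

The region that drops out is the double cone with vertices (0,0), (-1/2,1/2), (0,1), and (1/2,1/2), so the remaining set consists of two smaller triangles, one on each side of the hole.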


Definition 5.36 1. $S \subset M$ is acausal if there is no causal curve that starts and ends at S. 


2. The edge of an achronal set S consists of all $x \in M$ for which every nbhd U of x contains 
points y and z and two timelike curves from y to z, of which just one intersects S.²⁴⁶ 


3. A wannabe Cauchy surface is an acausal edgeless (and hence closed) subset of M.²⁴⁷ 


Wannabe Cauchy surfaces are a “second best” in the absence of Cauchy surfaces (i.e. of global 
hyperbolicity).²⁴⁸ A sufficient condition for their existence is the existence of a time function 
(see §5.9).²⁴⁹ Simple examples (that are not Cauchy surfaces) are the hyperboloids in $(\mathbb{M}, \eta)$: 


$S = \pm H; \qquad H_C(S) = H_C^{\mp}(S) = \partial I^{\pm}(0),$ (5.181) 


and the x-axis in the Quinten space-time (M}, 2), see §10.7. Cauchy horizons of wannabe 
Cauchy surfaces in black hole space-times provide important causal information about their 
interiors (see chapter 9). In the PDE approach to GR they arise when some MGHD is extendible, 
see §10.5, and a Cauchy surface for the MGHD turns into a partial one for the extension. 


245 In 2d Minkowski space, take $S = [-1,1] \times \{0\}$. Then $D^+(S)$ consists of the triangle with vertices (−1,0), 
(1,0), and (0,1), whose two upper sides comprise $H_C^+(S)$. Removing (0,0) from S removes the double cone with 
vertices (0,0), (−½,½), (0,1), and (½,½) from $D^+(S)$, whereas $H_C^+(S)$ now consists of two zig-zag teeth (draw!). 

246 See Penrose (1972), §5.6; the definition of an edge in Hawking & Ellis (1973), p. 202 or Minguzzi (2019), 
§2.18 is equivalent provided S is closed. By Corollary 2.142 in Minguzzi (2019), a closed acausal or achronal subset 
$S \subset M$ (think of a spacelike hypersurface) is edgeless (i.e. a wannabe Cauchy surface) iff $S \cup I^+(S) \cup I^-(S)$, the set 
of all points through which an inextendible timelike curve exists that intersects S, is open. For a Cauchy surface this 
set equals M, see (5.183), and is a maximal achronal set (Minguzzi, 2019, Proposition 3.37). 

247 Clearly, $\overline{S} \setminus S \subset \mathrm{edge}(S) \subset \overline{S}$ so that if S is edgeless in the sense that $\mathrm{edge}(S) = \emptyset$, then S is closed, as claimed. 
These are usually called partial Cauchy surfaces, but this is bad since some part of a Cauchy surface is not a partial 
Cauchy surface because it has an edge. Our terminology has the disadvantage that a Cauchy surface is also a 
wannabe Cauchy surface, but since some wannabes actually make it (e.g. Mick Jagger), this is the lesser evil. 

248 By Theorem 2.146 in Minguzzi (2019), edgelessness is necessary for maximality of an achronal set. 

249 See Theorems 3.39 and 4.100 in Minguzzi (2019). This is a far weaker condition than global hyperbolicity! 
Without proof we now collect some key properties of all these sets:²⁵⁰ 


Lemma 5.37 1. Edgeless achronal subsets are closed topological hypersurfaces in M. 
2. Future/past Cauchy horizons $H_C^{\pm}(S)$ of closed sets S are closed and achronal. 
3. If S is closed and achronal, then $\partial D^{\pm}(S) = H_C^{\pm}(S) \cup S$. 
4. If S is closed and achronal, then $\mathrm{edge}(H_C^{\pm}(S)) = \mathrm{edge}(S)$. 
5. If S is closed and acausal, then $H_C^{\pm}(S) \cap S = \mathrm{edge}(S)$. 
6. If S is a wannabe Cauchy surface, then $D^{\pm}(S) \setminus S$ is open and $H_C^{\pm}(S) \cap S = \emptyset$, so that 


$H_C^{\pm}(S) = \partial D^{\pm}(S) \setminus S; \qquad H_C(S) = \partial D(S).$ (5.182) 
We also have the following characterization of “true” Cauchy surfaces among the wannabes: 


Proposition 5.38 A wannabe Cauchy surface (or more generally a closed acausal set) $S \subset M$ is 
a Cauchy surface iff one (and hence all) of the following equivalent conditions is satisfied: 


1. D(S) = M, or equivalently $D^{\pm}(S) = J^{\pm}(S)$; 
2. $H_C(S) = \emptyset$, or equivalently $H_C^+(S) = H_C^-(S) = \emptyset$; 


3. Every inextendible curve of fixed causality class C intersects S exactly once, where C may 
(equivalently) be taken to be timelike, causal, or lightlike. 


In particular, a Cauchy surface has empty Cauchy horizon, and yields M as a disjoint union 
$M = I^-(S) \sqcup S \sqcup I^+(S).$ (5.183) 


The implication 1 ⇒ 2 is trivial for the first members (which imply the second). Conversely, if 
$H_C(S) = \emptyset$ then D(S) must be closed, but by Lemma 5.37.6 it is also open. Since M is connected, 
D(S) = M. Furthermore, 1 ⇔ 3 (causal case) is almost true by definition of D(S) and of a 
Cauchy surface. The equivalences within 3 are quite technical, and we omit the proofs.²⁵¹ 


250No, 1 is Lemma 3.17 in Penrose (1972) or O’Neill (1983), Proposition 14.25 and Corollary 14.26. A topological 
hypersurface is defined as in the second part of Definition 4.13, assuming all maps to be continuous. No. 2 is 
Proposition 3.15 in Minguzzi (2019). No. 3 is eq. (3.2) in Minguzzi (2019). No. 4 is Proposition 3.22 in Minguzzi 
(2019). No. 5 is (3.6) in Minguzzi (2019). No. 6 follows from 3 and 5, cf. Corollary 3.26 in Minguzzi (2019). 

251 See O’Neill (1983), Lemma 14.29 for the implication timelike ⇒ causal (the converse is trivial). See also 
Ringström (2009), §10.2.7, who proves 1 from 3 (timelike case), but this includes a proof of timelike ⇒ causal in 3. 
Given 1 ⇔ 3 (timelike case), Minguzzi (2019), Theorem 3.40, proves lightlike ⇒ timelike (whose converse follows 
from timelike ⇒ causal ⇒ lightlike). Note that in Lemma 14.29 O’Neill cannot exclude the case where a causal 
curve hits S more than once, but in Proposition 5.38 we assume S is acausal, which excludes this by definition. 
This time assuming that S is acausal, O’Neill (1983), Corollary 14.54, also directly (i.e. without assuming 1 ⇔ 3, 
timelike case) proves the implication lightlike ⇒ timelike. Note that both Hawking & Ellis (1973) and Minguzzi 
(2019) define Cauchy surfaces as closed acausal sets S for which D(S) = M, whereas Geroch (1970) and Penrose 
(1972) define them as achronal sets S for which D(S) = M, which is equivalent to Definition 5.32. Take t = 1 for 
x < −1, t = x for −1 < x < 0, and t = 0 for x > 0, in 2d Minkowski space-time. This achronal set defines a Cauchy 
surface for Geroch and Penrose but not for Hawking & Ellis and Minguzzi. However, in view of Theorem 5.44 we 
may always take S to be spacelike, and if we do, then by Lemma 14.42 in O’Neill (1983) it is automatically acausal. 
5.9 Time functions 


Using a new technique, in this section we finish our sketch of the proof of Theorem 5.34. First, 
we state a result that is important for both “wannabe” and “genuine” Cauchy surfaces: 


Proposition 5.39 If S ⊂ M is achronal, the interior int(D(S)) of D(S) is globally hyperbolic. 


Here global hyperbolicity is meant as in Definition 5.27. We will prove this proposition shortly, 
but already note that it implies that a space-time with a Cauchy surface S is globally hyperbolic, 
cf. Theorem 5.34.3. For given such an S, we have int(D(S)) = int(M) = M by Proposition 
5.38.1. The converse inference from global hyperbolicity à la Definition 5.27 to a Cauchy surface 
uses completely different arguments, which will be given later in this section. 

We now sketch a proof of Proposition 5.39, based on criterion 2 in Theorem 5.34.²⁵² If 
necessary moving the Cauchy surface S in int(D(S)), as in the comments below (5.166), we 
may place x ∈ D^-(S) and y ∈ D^+(S) (this move is not strictly necessary for the argument).²⁵³ 
There are two ways for the uniform bound (5.167) to fail. One is the possibility of closed causal 
loops (or more generally imprisoned inextendible curves), but these would cross S many times 
and this is excluded because S is achronal, cf. Theorem 5.33.1. Secondly, J(x,y) may fail to be 
compact, not because it is unbounded but because it is not closed, some points being missing. 
To understand the link with (5.167), we recall the assumption that the auxiliary Riemannian 
metric h measuring arc length be complete in Theorem 5.34.2. This implies that h-geodesics 
must avoid such missing points (for otherwise they would be incomplete), and they do so by 
increasing h-arc length near the missing points. Take, for example, Minkowski space-time with 
the origin removed. The Euclidean metric δ is incomplete on R^4 \ {0}, but the metric 


h(x) = δ/||x||^2, (5.184) 


where ||·|| is the Euclidean norm, is complete.²⁵⁴ Thus the h-arc length of a curve increases 
arbitrarily as it approaches the origin, and hence only infinitely long curves approach the origin. 
This behaviour near missing points is generic. Consequently, if the bound (5.167) is violated, 
there must be causal curves from x to y coming arbitrarily close to missing points in J(x,y) and 
then by Lemma 5.22 there will also be inextendible causal curves from either x or y to these 
points, at least one of which does not cross S, contradicting S being a Cauchy surface in int(D(S)). 
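
The growth of h-arc length near the missing point can be checked numerically. The following is a minimal sketch, assuming that the conformal factor in (5.184) is 1/||x||^2, so that the radial segment c(r) = r·e with ||e|| = 1 has h-speed 1/r; the function names are illustrative and not part of the text.

```python
import math

def euclidean_length(eps: float) -> float:
    """Euclidean length of the radial segment from radius 1 down to radius eps."""
    return 1.0 - eps

def h_length(eps: float, steps: int = 100_000) -> float:
    """Midpoint-rule h-arc length of the same segment, with h-speed 1/r (assumed form of (5.184))."""
    dr = (1.0 - eps) / steps
    return sum(dr / (eps + (k + 0.5) * dr) for k in range(steps))

for eps in (1e-1, 1e-2, 1e-3):
    # The exact h-length is log(1/eps), which is unbounded as eps -> 0,
    # whereas the Euclidean length stays below 1.
    print(eps, euclidean_length(eps), h_length(eps), math.log(1.0 / eps))
```

Under this assumption the Euclidean length of a curve running into the origin stays bounded, while its h-length grows like log(1/eps), which is the sense in which only infinitely long curves approach the missing point.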

This argument can be made rigorous by quoting the following generalization of Lemma 5.26:²⁵⁵ 


Lemma 5.40 Let (c_n : [0, b_n] → M) be a sequence of fd continuous causal curves from x to y ≠ x 
parametrized by h-arc length, i.e. c_n(0) = x and c_n(b_n) = y. There are two possibilities: 


• Either b_n → b < ∞, in which case there exist a fd continuous causal curve 
c : [0, b] → M, (5.185) 


and a subsequence of (c_n) that converges to c (in the sense of Definition 5.23); 


252 The “official” proof in Hawking & Ellis (1973), Proposition 6.6.3, or O’Neill (1983), Theorem 14.38, is very 
hard to understand, though apparently uncontroversial. A much clearer version of it is given by Chruściel (2011), 
Theorem 2.9.9, but the argument is still very involved. We therefore take a somewhat different route. 

253 One may instead use Lemma 6.6.4 in Hawking & Ellis (1973), whose proof is very clear: if x ∈ D^+(S) \ H_C^+(S), 
then every past-inextendible causal curve through x intersects I^-(S), and likewise for future-inextendible causal 
curves, so that every inextendible curve through x ∈ int(D(S)) intersects both I^+(S) and I^-(S). 

254 Continuing footnote 215: For (5.184), where h = δ, we have r(x) = ||x|| and φ(x) = 1/r(x) does the job. 

255 See Minguzzi (2019), Theorem 2.53, whose case (ii) was excluded in Lemma 5.26 by global hyperbolicity. 




• Or b_n → ∞, in which case there exist a fd future inextendible continuous causal curve 
c : [0, ∞) → M, (5.186) 


with c(0) = x and a subsequence of (c_n) that converges (uniformly) to c, as well as 
a pd past inextendible continuous causal curve d : (−∞, 0] → M with d(0) = y, and 
subsequences (c_{n_k}) of (c_n) and (b_{n_k}) of (b_n) such that d_k(t) := c_{n_k}(t + b_{n_k}) → d. 


In the second case, the limit curve c starting at x somehow fails to reach y (acquiring infinite 
h-arc length by wandering around), whereas d, starting at y and “moving back in time”, similarly 
fails to reach x. This is the situation mentioned in the heuristic part of the proof: at least one of 
these curves fails to reach S and hence S could not be a Cauchy surface in int(D(S)). 


The converse implication from Definition 5.27 to the existence of a Cauchy surface is very 
different and is based on the construction of a time function: 


Definition 5.41 A time function t : M → R is a continuous surjection that strictly increases 
along any fd continuous causal curve. 


We now show that global hyperbolicity implies the existence of time functions,²⁵⁶ having further 
properties guaranteeing that each level set 


Σ_t = {x ∈ M | t(x) = t} (5.187) 

is a Cauchy surface. Thus we do not get one Σ but a whole family (Σ_t), which foliates M by 

M = ⊔_{t∈R} Σ_t. (5.188) 

To construct t, we once again take a complete Riemannian metric h on M, as well as some at 
most countable open cover (V_n) with precompact elements (i.e. the closure of V_n is compact for each n), so 
that M = ∪_n V_n, with some associated partition of unity (φ_n) subordinate to the cover.²⁵⁷ 
We then turn the standard Riemannian measure μ_h, induced by h,²⁵⁸ into a probability measure 
ν_h = χ μ_h, where the function χ : M → R is defined by χ = Σ_n 2^{-n} φ_n / ∫_{V_n} dμ_h φ_n. Without 
any assumption on a space-time (M, g), this measure: i) is finite (i.e. ν_h(A) < ∞ for any Borel 
measurable A ⊂ M); ii) is open, in that ν_h(U) > 0 for any open set U ⊂ M; iii) is regular (in the usual 
sense of measure theory);²⁵⁹ and iv) assigns zero measure to the achronal boundaries ∂I^±(x). 
Any measure with these properties can be used in the following construction. Define 


V^± : M → R^+; t : M → R; (5.189) 


V^±(x) = ν_h(J^±(x)); t(x) = ln( V^-(x) / V^+(x) ). (5.190) 


256 The existence of a time function is equivalent to the weaker assumption of stable causality, which means that a 
space-time (M,g) has a Lorentzian metric g’ such that (M,g’) is causal and g(X,X) < 0 implies g’(X,X) < 0. See 
e.g. Minguzzi & Sanchez (2008), Definition 3.52 and Theorem 3.56. Global hyperbolicity yields (5.191) below. 

257 This means that φ_n ∈ C_c^∞(V_n) and Σ_n φ_n(x) = 1 for all x ∈ M. 

258 This measure is defined intrinsically, but in coordinates we have dμ_h(x) = √(det h(x)) dx^0 ⋯ dx^3. See §7.1. 

259 This means that for any Borel set A ⊂ M one has outer regularity ν_h(A) = inf{ν_h(U) | U ⊃ A, U ⊂ M open} 
as well as inner regularity ν_h(A) = sup{ν_h(K) | K ⊂ A, K compact}. This follows from the fact that ν_h is 
equivalent to Lebesgue measure in any local chart, which also implies the last property, given that the achronal 
boundaries ∂I^±(x) have dimension 3 and ν_h is supported in dimension 4. 




Now very simple examples (e.g. Quinten space-time) show that V^+ and V^- may easily be 
discontinuous, but compactness of all double cones is sufficient to make both functions continuous; 
in fact, for any sequence x_n → x one then has 1_{J^±(x_n)} → 1_{J^±(x)} a.e.²⁶⁰ 

Furthermore, any of the assumptions of causality, non-imprisonment, or strong causality 
(which are equivalent when all double cones are compact, see below) suffice to prove that:²⁶¹ 


1. V^- strictly increases and V^+ strictly decreases along fd causal curves. This is natural, as 
moving forward in time leaves more causal past behind and anticipates less causal future. 


2. Along any inextendible causal curve c : R → M (parametrized by h-arc length) one has 


lim_{t→+∞} V^+(c(t)) = lim_{t→−∞} V^-(c(t)) = 0; (5.191) 
lim_{t→±∞} t(c(t)) = ±∞. (5.192) 


See Lemma 5.22 for the domain R. Eq. (5.191), which implies (5.192), is also quite intuitive, 
for if there had been any causal future J^+(x) left beyond the end of the curve, then c(·) could 
have been extended into it and hence would not have been future inextendible. Similarly, no 
causal past is left before the beginning of the curve.²⁶² This implies that t as defined in (5.190) 
has the right properties to serve as a time function, which strictly monotonically increases from 
−∞ to +∞ along any fd causal curve parametrized by h-arc length. This also means that any 
such curve hits each set Σ_t as defined by (5.187) once, which makes all Σ_t Cauchy surfaces. 
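
The construction can be made concrete in a toy case. The sketch below is an illustration under stated assumptions rather than the general construction: it takes the diamond {|t| + |x| < 1} in 2d Minkowski space-time (which is globally hyperbolic), uses null coordinates u = t − x, v = t + x in (−1, 1), and replaces ν_h by normalized Lebesgue measure du dv / 4, which has properties i) to iv). The causal past and future of a point are then rectangles, so V^± and the time function t = ln(V^-/V^+) are explicit. All function names are hypothetical.

```python
import math

def volumes(u: float, v: float) -> tuple[float, float]:
    """Return (V^-, V^+) at (u, v): normalized areas of the causal past and causal future."""
    return (1 + u) * (1 + v) / 4, (1 - u) * (1 - v) / 4

def time_function(u: float, v: float) -> float:
    """Time function t = ln(V^- / V^+) on the diamond."""
    v_minus, v_plus = volumes(u, v)
    return math.log(v_minus / v_plus)

def on_timelike_line(s: float, slope: float = 0.5) -> tuple[float, float]:
    """Null coordinates of the point (t, x) = (s, slope * s); timelike for |slope| < 1."""
    return s - slope * s, s + slope * s

# With slope 0.5 the line stays inside the diamond for |s| < 2/3.
for s in (-0.6666, -0.3, 0.0, 0.3, 0.6666):
    print(s, time_function(*on_timelike_line(s)))
```

In this model V^+ tends to 0 towards the future end of the curve and V^- tends to 0 towards its past end, so that t runs monotonically over all of R, in line with (5.191) and (5.192).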


This finishes the sketch of the proof of Theorem 5.34, but there is much more to say about 
the potential smoothness of the Cauchy surfaces Σ_t as well as of the time function t. We first 
sharpen the definition of a time function (see Definition 5.41) in the following way: 


Definition 5.42 A temporal function is a smooth surjection t : M → R with timelike past- 
directed gradient ∇t = (dt)^♯, or, in coordinates, ∇^μ t = g^{μν} ∂_ν t. 


Temporal functions are time functions, for if c : I → M is fd timelike, then ċ(t) = g(∇t, ċ) is 
strictly positive along c.²⁶³ If t is a temporal function, then the level set Σ_t as defined in 
(5.187) is spacelike, since for any x ∈ Σ_t and X ∈ T_xM we have g_x(∇t, X) = Xt(x), which by 
(5.187) vanishes for any X ∈ T_xΣ_t. This forces g_x(X,X) > 0 by the following lemma. 


Lemma 5.43 For any Lorentzian metric g, if g(T,T) < 0 and g(T,X) = 0 with X ≠ 0, then g(X,X) > 0. 


Proof. Taking an orthonormal basis reduces this to the Minkowski case. Let T = (T_0, 𝐓) 
and X = (X_0, 𝐗). Then T_0^2 > ||𝐓||^2 and T_0 X_0 = 𝐓 · 𝐗, so that |T_0 X_0| ≤ ||𝐓|| ||𝐗||. This implies 
X_0^2 < ||𝐗||^2, since X_0^2 ≥ ||𝐗||^2 gives a contradiction. For example, if T_0 > 0 and X_0 > 0 the 
assumptions give T_0 > ||𝐓|| and T_0 X_0 ≤ ||𝐓|| ||𝐗||, which contradict X_0 ≥ ||𝐗||. The other three 
cases (i.e. T_0 > 0 and X_0 < 0, T_0 < 0 and X_0 > 0, and T_0 < 0 and X_0 < 0) are similar. 
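
As a sanity check on Lemma 5.43, the following sketch samples vectors in 4d Minkowski space-time with signature (−,+,+,+), projects a random vector onto the orthogonal complement of a timelike T, and evaluates g(X,X). The sampling scheme and the helper names are assumptions made for illustration only; a numerical check of this kind does not replace the proof.

```python
import random

def minkowski(a, b):
    """Minkowski inner product with signature (-, +, +, +)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))

def orthogonal_part(T, Y):
    """Subtract from Y its component along the timelike vector T, so that g(T, X) = 0."""
    c = minkowski(T, Y) / minkowski(T, T)
    return [y - c * t for y, t in zip(Y, T)]

def sample(rng):
    """Return a timelike T (T_0 exceeds the spatial norm by at least 0.5) and some X orthogonal to it."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(3)]
    norm = sum(s * s for s in spatial) ** 0.5
    T = [norm + rng.uniform(0.5, 1.5)] + spatial
    Y = [rng.uniform(-1.0, 1.0) for _ in range(4)]
    return T, orthogonal_part(T, Y)

rng = random.Random(0)
T, X = sample(rng)
print(minkowski(T, T), minkowski(T, X), minkowski(X, X))
```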


260 See e.g. Chruściel (2011), §2.11 for very clear proofs of this and the following properties. 

261 The stronger property of global hyperbolicity is needed to prove (5.191), which implies (5.192). See Minguzzi 
& Sanchez (2008) as well as the papers by the latter and Bernal cited in footnote 238. 

262 To illustrate what happens at a more technical level, let us show that x ↦ ν_h(J^-(x)) is strictly increasing along 
fd causal curves c. Take x and y ∈ J^+(x) on c; then y ∉ J^-(x) by causality. Global hyperbolicity also guarantees 
that J^-(x) is closed, so that its complement in M is open and hence y has an open nbhd U disjoint from J^-(x). But 
J^-(x) ⊂ J^-(y) and ν_h(U ∩ J^-(y)) = ν_h(U ∩ I^-(y)) > 0, by the properties of ν_h, so that ν_h(J^-(y)) > ν_h(J^-(x)). 

263 Here ċ(t) : c(I) → R applies the tangent vector ċ to the function t, not to be confused with ċ(τ) ∈ T_{c(τ)}M. 




We then have the following result, which smoothens out earlier topological properties: 


Theorem 5.44 A space-time (M,g) is globally hyperbolic iff it has a smooth spacelike Cauchy 
surface. In that case: 


1. There exists a smooth temporal function t : M → R such that M is foliated as in (5.188), 
where each Σ_t is a smooth spacelike Cauchy surface and all Σ_t are diffeomorphic. 


2. M is diffeomorphic to R × Σ, where Σ is diffeomorphic to Σ_t for any t ∈ R. 

3. (M,g) is isometric to (R × Σ, g′), where the metric g′ is given in the 3+1 form 

g′ = −L dt² + g̃_t, (5.193) 


in which L : R × Σ → (0,∞) is the (smooth) lapse function and g̃_t is a (possibly time- 
dependent) Riemannian metric on the Cauchy surface Σ. 


This landmark theorem is due to Bernal and Sánchez.²⁶⁴ The proof, which is very technical, 
constructs a temporal function t, which gives a time orientation on M via the vector field 


T = −∇t. (5.194) 


In Minkowski space-time, with t = x^0, this would be T = ∂_t, whence the minus sign in T. This 
implies claim 2, for in the construction of the map R × Σ → M sketched after (5.166) one can 
take T proportional to ∇t. The remainder of the proof requires machinery beyond our scope. 
Definition 5.27 and Theorem 5.34 state the various definitions of global hyperbolicity as they 
have been used for about the first 50 years of Lorentzian causality theory, except that in Definition 
5.27 strong causality has traditionally been used instead of non-imprisonment (and even the 
still weaker causality property could have been used).²⁶⁵ However, given compactness of the 
double cones, if dim(M) ≥ 3 it turns out to be sufficient for global hyperbolicity to require the 
extremely weak condition that (M,g) be non-totally vicious, which means that there need just be 
a single point through which no closed timelike curve passes.²⁶⁶ If all J^±(x) are closed (which, 
as Lemma 5.29 shows, is a consequence of compactness of the double cones), this implies that 
(M, g) is strongly causal (and hence non-imprisoning). Moreover, if (M,g) is totally vicious 
(i.e. there is a closed timelike curve through each point), then J^+(x) = M for each x. If it is also 
required that all J(x,y) are compact, this forces M to be compact. Hence if M is non-compact 
and all double cones are compact, then (M, g) cannot be totally vicious. In conclusion, under 
physically reasonable assumptions global hyperbolicity has reached a very simple form:²⁶⁷ 


Proposition 5.45 Let (M,g) be a space-time with dim(M) ≥ 3 and M non-compact. Then 
(M,g) is globally hyperbolic iff all double cones J^+(x) ∩ J^-(y) are compact. 


264 See references in footnote 238. Apart from these, see also Ringström (2009), chapter 11. Other constructions 
of temporal functions were given by Fathi & Siconolfi (2012) and Chruściel, Grant, & Minguzzi (2016). 

265 See Bernal & Sánchez (2006) and Minguzzi (2019). 

266 And hence a space-time is totally vicious if through every point there passes some closed timelike curve. 

267 For a complete proof see Hounnonkpe & Minguzzi (2019), which relies on results by Clarke & Joshi (1988). 
Briefly, the latter proved the inference from non-total viciousness to chronology (assuming an even weaker property 
than closedness of all sets J^±(x), namely that (M, g) is reflecting, which means that x lies in the closure of J^-(y) 
iff y lies in the closure of J^+(x)), which the former then strengthened to strong causality. This simplification is 
quite remarkable, after 50 years! 
5.10 Global hyperbolicity: AdS as a counterexample 


After all this abstraction, some specific examples may be welcome. Minkowski space-time 
(𝕄, η) is globally hyperbolic as well as geodesically complete, and the removal of any point 
from it ruins both properties in a somewhat trivial way. Much more interesting examples arise 
from the two combinations geodesically complete but non-globally hyperbolic and the opposite, 
i.e. geodesically incomplete but globally hyperbolic. The latter will be the subject of chapters 
6 and 9. A nice example of the former is anti de Sitter space-time AdS_ρ^n, defined for any 
n = dim(AdS_ρ^n) ≥ 2 and ρ > 0, see §4.4. To make our point, the case n = 2, ρ = 1 suffices, for 
which we simply write AdS = AdS_1^2; the conclusions will be true for any n ≥ 2 and ρ > 0. 

Eq. (4.93) gives AdS as the set of all (x_{-1}, x_0, x_1) ∈ R^3 such that x_{-1}^2 + x_0^2 = x_1^2 + 1, with 
metric induced from η′ = diag(−1, −1, +1). This space is homeomorphic to S^1 × R and contains 
closed timelike curves, such as t ↦ (cos t, sin t, 0) defined on t ∈ [0, 2π]. For this reason alone it 
cannot be globally hyperbolic, but its Lorentzian cover ÃdS, where these curves are defined for 
all t ∈ R and no longer close, isn’t either, as it fails on compactness of double cones J(x,y). 


Figure: Some lightlike geodesics in 2d anti de Sitter space-time. The χ-axis is horizontal, the τ-axis 
is vertical. The blue lines are the two lightlike geodesics through the origin x = (0,0), cf. 
the corresponding cross in 2d Minkowski space-time 𝕄. These blue curves asymptote to 
±½π as |χ| → ∞. Similarly, the red lines are the lightlike geodesics through y = (0,4). The 
set J^+(x) is the area above the two upper blue lines (including boundary) whilst J^-(y) is 
the area below the two lower red lines (idem). The double cone J^+(x) ∩ J^-(y) stretches on 
forever to the left and to the right and hence is not compact, violating global hyperbolicity. 


To see this,²⁶⁸ introduce local coordinates (τ, χ) ∈ (−π, π) × R initially on AdS by 

x_{-1} = cos τ cosh χ; x_0 = sin τ cosh χ; x_1 = sinh χ. (5.195) 

In these coordinates, the metric on AdS is simply given by 

ds² = −cosh²χ dτ² + dχ², (5.196) 

and therefore ÃdS can be globally coordinatized by (τ, χ) ∈ R², with the same metric (5.196). 
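
Both claims, that (5.195) lands on the quadric and that the induced metric is (5.196), can be verified numerically. The sketch below pulls back η′ = diag(−1, −1, +1) along the embedding using central differences; the function names and the step size are illustrative choices, not taken from the text.

```python
import math

ETA = (-1.0, -1.0, 1.0)  # ambient metric eta' = diag(-1, -1, +1)

def embed(tau: float, chi: float) -> tuple[float, float, float]:
    """The map (tau, chi) -> (x_{-1}, x_0, x_1) of (5.195)."""
    return (math.cos(tau) * math.cosh(chi),
            math.sin(tau) * math.cosh(chi),
            math.sinh(chi))

def quadric(p) -> float:
    """x_{-1}^2 + x_0^2 - x_1^2, which should equal 1 on AdS."""
    return p[0] ** 2 + p[1] ** 2 - p[2] ** 2

def induced_metric(tau: float, chi: float, h: float = 1e-5):
    """Return (g_tautau, g_tauchi, g_chichi), using central differences for the tangent vectors."""
    d_tau = [(a - b) / (2 * h) for a, b in zip(embed(tau + h, chi), embed(tau - h, chi))]
    d_chi = [(a - b) / (2 * h) for a, b in zip(embed(tau, chi + h), embed(tau, chi - h))]
    dot = lambda v, w: sum(e * a * b for e, a, b in zip(ETA, v, w))
    return dot(d_tau, d_tau), dot(d_tau, d_chi), dot(d_chi, d_chi)

print(quadric(embed(0.7, 1.3)))
print(induced_metric(0.7, 1.3), -math.cosh(1.3) ** 2)
```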


268 Let R^3 have any (semi) Riemannian metric g′. Take a surface Σ = F(U) ⊂ R^3 defined by a smooth injective 
function F : U → R^3 satisfying the conditions stated at the beginning of §4.3, where U ⊂ R^2 is open. Then the 
induced metric g on Σ is given by g_{μν}(u^1,u^2) = Σ_{i,j=1,2,3} g′_{ij}(F(u^1,u^2)) (∂F^i(u^1,u^2)/∂u^μ) (∂F^j(u^1,u^2)/∂u^ν), 
where μ,ν = 1,2. 




Taking χ(t) = t, lightlike (pre)geodesics in ÃdS are solutions of τ̇_±(t) = ±1/cosh t. Thus 


τ_±(χ) = ±2 arctan(tanh(½χ)), (5.197) 


gives the lightlike (pre)geodesics through (0,0), whilst those through (0,4) are the same, 
moved up by (0,4). In the (χ, τ) plane these give the blue and red curves, respectively. 
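
A short numerical sketch makes the failure of compactness explicit. Assuming the form τ_±(χ) = ±2 arctan(tanh(χ/2)) of (5.197), it checks the slope 1/cosh χ, the asymptote ½π, and the fact that the vertical extent of the double cone between (0,0) and (0,4) never drops below 4 − π, however far one moves along the χ-axis. The helper names are hypothetical.

```python
import math

def tau_plus(chi: float) -> float:
    """Upper lightlike curve through the origin: tau = 2 arctan(tanh(chi / 2))."""
    return 2.0 * math.atan(math.tanh(chi / 2.0))

def slope(chi: float, h: float = 1e-5) -> float:
    """Central-difference derivative d tau / d chi, to be compared with 1 / cosh(chi)."""
    return (tau_plus(chi + h) - tau_plus(chi - h)) / (2.0 * h)

def double_cone_height(chi: float, tau_y: float = 4.0) -> float:
    """Extent in tau of the double cone between (0, 0) and (0, tau_y) above the point chi."""
    lower = tau_plus(abs(chi))           # boundary of the causal future of (0, 0)
    upper = tau_y - tau_plus(abs(chi))   # boundary of the causal past of (0, tau_y)
    return upper - lower

print(slope(0.8), 1.0 / math.cosh(0.8))
print(tau_plus(50.0), math.pi / 2)
print(double_cone_height(1000.0), 4.0 - math.pi)
```

Since the height stays positive for all χ, the double cone contains points with arbitrarily large |χ| and so cannot be compact.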


Timelike geodesics in AdS are also quite remarkable, as the following picture shows. 
Figure: Some timelike geodesics γ_C through the origin in AdS. The inner blue one has C = 0.1, the 
black one has C = 0.5, the next blue one has C = 1, and the outer red curve has C = 2. 
For C = 0 the geodesic is simply the τ-axis; the geodesics corresponding to negative values 
of C are the mirror images (in the τ-axis) of those displayed. All geodesics spiral around the 
τ-axis and continue to focus and defocus. Mind the difference in scale between the axes! 


The geodesic equations for the metric (5.196) are easily found to be 
τ̈ + 2 tanh χ · τ̇ χ̇ = 0; (5.198) 
χ̈ + cosh χ sinh χ · τ̇² = 0, (5.199) 
and one can explicitly find all timelike geodesics through the origin, namely 
χ_C(t) = arcsinh(C · sin t); (5.200) 
τ_C(t) = √(1+C²) ∫_0^t ds (1 + C² sin² s)^{-1}, (5.201) 
where C ∈ R is a constant,²⁶⁹ physically interpreted via χ̇(0) = C and τ̇(0) = √(1+C²). These 
geodesics γ_C(t) = (χ_C(t), τ_C(t)) are all timelike, with g(γ̇_C(t), γ̇_C(t)) = −1 for all C and t ∈ R. 
Combining the two plots gives another way to see that AdS is not globally hyperbolic, because 
vast areas in J^+(0,0) are inaccessible by timelike geodesics from the origin, eternally attracted to 
the τ-axis as these apparently are. Thus global hyperbolicity would contradict Theorem 5.30. 
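
The explicit solutions can be checked against the geodesic equations. The sketch below assumes the reading χ_C(t) = arcsinh(C sin t) and τ̇_C(t) = √(1+C²)/(1 + C² sin² t) of (5.200) and (5.201), evaluates the normalization g(γ̇, γ̇), and computes the residuals of (5.198) and (5.199) with central differences. Function names and step sizes are illustrative.

```python
import math

def chi(t: float, C: float) -> float:
    return math.asinh(C * math.sin(t))

def chi_dot(t: float, C: float) -> float:
    return C * math.cos(t) / math.sqrt(1.0 + (C * math.sin(t)) ** 2)

def tau_dot(t: float, C: float) -> float:
    return math.sqrt(1.0 + C * C) / (1.0 + (C * math.sin(t)) ** 2)

def norm_squared(t: float, C: float) -> float:
    """g(gamma', gamma') for the metric -cosh^2(chi) dtau^2 + dchi^2; should equal -1."""
    return -math.cosh(chi(t, C)) ** 2 * tau_dot(t, C) ** 2 + chi_dot(t, C) ** 2

def residuals(t: float, C: float, h: float = 1e-5) -> tuple[float, float]:
    """Left-hand sides of (5.198) and (5.199), with second derivatives by central differences."""
    tau_ddot = (tau_dot(t + h, C) - tau_dot(t - h, C)) / (2.0 * h)
    chi_ddot = (chi_dot(t + h, C) - chi_dot(t - h, C)) / (2.0 * h)
    x = chi(t, C)
    r1 = tau_ddot + 2.0 * math.tanh(x) * tau_dot(t, C) * chi_dot(t, C)
    r2 = chi_ddot + math.cosh(x) * math.sinh(x) * tau_dot(t, C) ** 2
    return r1, r2

for C in (0.1, 0.5, 1.0, 2.0):
    print(C, norm_squared(1.7, C), residuals(1.7, C), math.asinh(C))
```

The last printed column is the bound |χ_C(t)| ≤ arcsinh|C|, which quantifies how these geodesics remain confined near the τ-axis.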


269 For 0 ≤ t < ½π one has τ_C(t) = arctan(√(1+C²) · tan t), but the formula in the main text is more useful. 
6 The singularity theorems of Hawking and Penrose 


We now turn to the famous singularity theorems of Hawking and Penrose.²⁷⁰ The possibility of 
singular space-times in GR was suggested by three of the earliest exact solutions to Einstein’s 
equations, namely the Schwarzschild solution from 1916, the de Sitter universe from 1917 (here 
given in Eddington’s coordinates from 1922, see §9.1), and the Friedman-Lemaître-Robertson-Walker 
(FLRW) solution first found (by Friedman) in 1922. These are given by 


ds² = −(1 − 2m/r) dt² + (1 − 2m/r)^{-1} dr² + r² dΩ; (6.1) 

ds² = −(1 − r²/ρ²) dt² + (1 − r²/ρ²)^{-1} dr² + r² dΩ; (6.2) 

ds² = −dt² + a(t)² (dχ² + f(χ)² dΩ), (6.3) 


respectively, where ds² is just the original notation for the metric, and dΩ is defined in (4.72). 


• In the vacuum Schwarzschild solution (6.1), m > 0 is the mass of some gravitating object 
and the space-time is M_S = R × Σ, where at least initially, in polar coordinates (r, θ, φ), 
the spatial part Σ ⊂ R^3 is restricted to r > 2m. Here the value r = 2m looks threatening, 
as does r = 0 (although the latter is not, as yet, in the domain of the solution).²⁷¹ 


• The de Sitter metric (6.2), initially defined as above but now for 0 < r < ρ, requires a 
cosmological constant Λ = 3/ρ², where ρ is the radius of the visible universe. The 
potential danger lies at r = ρ, which looks as bad as r = 2m in the Schwarzschild metric. 


• The FLRW solution (6.3) requires matter. The space-time is M = (0, ∞) × Σ, where: 


– Σ = S^3 (the 3-sphere) and f(χ) = sin χ for k = 1 (positive curvature); 
– Σ = R^3 and f(χ) = χ for k = 0 (zero curvature); 
– Σ = H^3 (the 3d hyperboloid) and f(χ) = sinh χ for k = −1 (negative curvature). 


The function a(t) depends on the precise matter content of the universe. For example, for 
a dust-filled spatially flat universe one has a(t) ~ t^{2/3} as t → 0, where also the Ricci scalar 
R blows up. The precise form of R(t) again depends on the matter content, but in the same 
dust-filled case one finds R(t) ~ t^{-2}. Similarly for other forms of matter. See also §8.3. 
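
The t^{-2} behaviour in the dust case can be reproduced with a few lines of arithmetic. The sketch below assumes the standard expression R = 6 (ä/a + (ȧ/a)²) for the Ricci scalar of a spatially flat (k = 0) FLRW metric in signature (−,+,+,+); this formula is not stated in the text above and is brought in here as an assumption, as are the function names.

```python
from fractions import Fraction

def ricci_coefficient(p: Fraction) -> Fraction:
    """Coefficient c in R(t) = c / t**2 for a(t) = t**p and k = 0."""
    # a''/a = p (p - 1) / t**2 and (a'/a)**2 = p**2 / t**2
    return 6 * (p * (p - 1) + p * p)

def ricci_scalar(t: float, p: float) -> float:
    """Numerical value of R(t) = 6 (a''/a + (a'/a)**2) for a(t) = t**p."""
    a = t ** p
    a1 = p * t ** (p - 1)
    a2 = p * (p - 1) * t ** (p - 2)
    return 6.0 * (a2 / a + (a1 / a) ** 2)

dust = Fraction(2, 3)
print(ricci_coefficient(dust))        # exact coefficient of 1/t**2 for dust
print(ricci_scalar(1e-3, 2.0 / 3.0))  # grows like 1/t**2 as t -> 0
```

For the dust exponent p = 2/3 the coefficient is 4/3, so that R(t) = 4/(3t²) under the stated assumption.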


Even Hilbert and Einstein were initially confused about the meaning of these apparent or real 
singularities, but today it is clear that r = 2m and r = ρ are just singularities of the coordinate 
systems in which the Schwarzschild and de Sitter solutions are expressed. This is not to say 
that nothing interesting happens at r = 2m: the hypersurface r = 2m is an event horizon, see 
chapters 9 and 10, and even the utterly regular (even constant-curvature) de Sitter space-time has 
a kind of horizon at r = ρ, see §9.1. On the other hand, in the Schwarzschild solution r = 0, if it 
were part of space-time, would be a real singularity of the metric and its associated tensors.²⁷² 


270 Israel (1987), Tipler, Clarke & Ellis (1980), Clarke (1993), Earman (1995, 1999), Earman & Eisenstaedt 
(1999), Senovilla (1998), Joshi (2014), Senovilla & Garfinkle (2015), and Curiel (2019) survey singularities and 
singularity theorems in GR, including history. The classical textbook exposition remains Hawking & Ellis (1973). 

271 We will study this solution in detail in §9.2. Mathematically, it also makes sense for m < 0, see §9.5. 

272 The singularity is detected by the Kretschmann scalar R^{ρσμν}R_{ρσμν}, which goes like r^{-6} as r → 0, see (9.18). 
In his (very brief) published reply to de Sitter and his universe,²⁷³ Einstein in some sense 
paved the way to the modern notion of a singularity in arguing about de Sitter’s metric (6.2) that: 


1. The singularity in the metric [at r = ρ] is real since ‘it seems that no choice of coordinates 
can remove this discontinuity’ [immaterially, Einstein said this in different coordinates]. 


2. ‘Singular’ points [at r = ρ] can be reached from ‘regular’ points in finite proper time. 


3. The conjunction of 1 and 2 is a ‘grave argument against the admissibility of this solution’, 
because this conjunction makes r = ρ ‘a genuine singularity’ (‘eine echte Singularität’). 


Although claim 1 is simply wrong (de Sitter space is regular in every conceivable way), and 
criterion 2 is stated incorrectly (one should use timelike geodesics instead of the arbitrary timelike 
curves Einstein mentions in his paper),²⁷⁴ the underlying idea was surely forward-looking!²⁷⁵ 
Talk about singularities in GR remained ambiguous until the 1960s. In 1963, Misner first 
argued that everything should be described in terms of a regular manifold M carrying a regular 
metric g, so that singularities cannot be “in” space-time (this marks a decisive difference with, 
for example, singularities in the electromagnetic field).²⁷⁶ Attempting to capture what it could 
mean for a Lorentzian manifold (M, g) to be “singular”, Misner then required the implications 


curvature singularity ⇒ (M,g) is “singular” ⇒ (M,g) is geodesically incomplete (*) 


where ‘curvature singularity’ means unbounded curvature along a (semi) open geodesic segment, 
cf. Proposition 6.2 below. He added that it is ‘commonly accepted’ that an ‘essentially singular 
space’ is not only incomplete, but also inextendible (cf. Definition 6.1). Hence we have a 
sufficient condition and a necessary condition for a space-time to be singular, but not yet a definition. 
Inspired by Penrose’s paper from 1965 discussed below, it may have been Hawking (1966), §6.1, 
i.e. in his Adams Prize Essay, who, explicitly assuming inextendibility, first proposed to:²⁷⁷ 


take timelike and lightlike geodesic incompleteness as our definition of a singularity of 
spacetime. 


In defense of this definition, Hawking and Ellis make both a physical and a pragmatic point: 


‘Timelike geodesic incompleteness has an immediate physical significance in that it presents 
the possibility that there could be freely moving observers or particles whose histories did 
not exist after (or before) a finite interval of proper time. This would appear to be an even 
more objectionable feature than infinite curvature and so it seems appropriate to regard such 
a space as singular. (...) The advantage of taking timelike and/or null incompleteness as 
being indicative of the presence of a singularity is [also] that on this basis one can establish 
a number of theorems about their occurrence.’ (Hawking & Ellis, 1973, p. 258) 


Although this has become the “received definition”, one may side with Geroch (1968), p. 526: 


(a) there is no widely accepted definition of a singularity in general relativity; 


(b) each of the proposed definitions is subject to some inadequacy. 


273 This reply is Einstein (1918d), which is Doc. 5 in Einstein (2002a), translated in Einstein (2002b), Doc. 5. 

274 For any timelike curve reaching some point in infinite proper time one can find another timelike curve doing so 
in finite time, which makes Einstein’s way of stating point 2 empty (Clarke, 2003, pp. 2-3). 

275 Einstein (1939) still used this kind of reasoning to explain why the Schwarzschild radius r = 2m was (allegedly) 
inaccessible and hence should not be seen as a singularity. Lack of stubbornness was not his trait! 

276 See Misner (1963), who acknowledges L. Shapley and L. Marcus as sources of his discussion. 

277 In his PhD Thesis, Hawking (1965), §4.1, says, without any further ado (or mention of inextendibility): 
‘any model must have a singularity, that is, it cannot be a geodesically complete C¹, piecewise C² manifold.’ 
Remarkably, in the single most famous paper on singularities in GR, Penrose (1965) does 
not explicitly define what he means by a “singularity”, although all uses of the word refer to 
examples of r = 0 curvature singularities, and the general context of stellar collapse also suggests 
that this is what he means. Nonetheless, what he proves is geodesic incompleteness (see Theorem 
6.15), and since Misner’s implications (*) do not work the other way round, this leaves a gap.²⁷⁸ 
This gap did not reduce the huge and immediate impact of the paper (later Penrose won half 
of the 2020 Physics Nobel Prize on this basis). Namely, until 1965 it had been quite unclear 
whether singularities (however defined) were generic or exceptional (i.e. typical of very special 
solutions with a high degree of symmetry, and hence absent in realistic solutions).²⁷⁹ Penrose’s 
work, almost instantly followed by Hawking’s, settled this in favour of genericity. Hence despite 
Geroch’s warning,²⁸⁰ we now formalize the “received definition” (cf. Definition 5.18). 


Definition 6.1 1. A space-time (M,g) (cf. Definition 5.3) is extendible if there exists a 
space-time (M′,g′), where dim(M′) = dim(M) and M′ ≠ M, and an isometric embedding 
i : M ↪ M′ (i.e. i*g′ = g) with i(M) open. It is inextendible if this is not the case.²⁸¹ 


2. A space-time (M, g) is causally incomplete if M contains an incomplete causal geodesic. 


3. A space-time (M,g) is singular if it is both causally incomplete and inextendible in a way 
relevant to its incompleteness, that is, if M contains an incomplete causal geodesic γ such 
that there is no extension i : M → M′ for which the curve i ∘ γ is extendible.²⁸² 


A very useful criterion for proving inextendibility of space-times (M, g) is the following:²⁸³ 


Proposition 6.2 If either all causal (or even just all timelike) geodesics in M are complete, or 
for any incomplete causal (ibid.) geodesic γ : [0,b) → M in M there is a curvature invariant 
(such as R or R^{ρσμν}R_{ρσμν}, etc.) that is unbounded as t → b, then (M,g) is inextendible. 


278 It is easy to get confused by what Penrose writes: ‘If, as seems justifiable, actual physical singularities in 
space-time are not to be permitted to occur, the conclusion would appear inescapable that inside such a collapsing 
object at least one of the following holds: (...) (c) The space-time manifold is incomplete (...)’. A footnote at this 
point adds that ‘The “I’m all right, Jack” philosophy with regard to the singularities would be included under this 
heading!’. This suggests that he considers incompleteness a logically possible but cheap way to avoid singularities. 
Just before stating his theorem, he adds that ‘the existence of a singularity can never be inferred, however, without 
an assumption such as completeness of the manifold under consideration’ (Penrose, 1965, p. 58). This looks like 
the very opposite of defining singularities through geodesic incompleteness! But it all makes sense if “complete” is 
taken to mean inextendible, as both Erik Curiel and José Senovilla propose (in e-mails, May 2021). See also §10.4. 

279 For example, Einstein and Landau’s school maintained the latter. Earlier work by Raychaudhuri, Komar, 
Szekeres, Misner, and Shepley towards the singularity theorems is discussed in the references in footnote 270. 

280 In defense, it is hard to make rigorous the idea that in space-time singularities the curvature blows up (Clarke, 
1993; Senovilla, 1998). For example, on the one hand the components of the Riemann and Ricci tensors are 
coordinate-dependent but on the other hand curvature scalars like R, R^{μν}R_{μν}, or R^{ρσμν}R_{ρσμν} do not capture all 
about curvature. Furthermore, since singularities are not actual points of space-time one cannot speak of nbhds of 
such points either. There are also space-times that both common sense and Definition 6.1 deem singular although 
the curvature is bounded. Two examples are the 2d Misner space-time (Misner, 1967; Thorne, 1993; Hawking & 
Ellis, 1973, §5.8), and the flat 4d space-time R^4/∼ whose spatial polar coordinates (r, θ, φ) are defined just for 
0 ≤ φ ≤ α, where π < α < 2π, and (t, r, θ, φ = 0) ∼ (t, r, θ, φ = α). This has a conical singularity at r = 0. 

281 This tacitly assumes smoothness of (M′,g′) and i. One may also keep M′ and i smooth but lower the regularity 
of g′. This is important e.g. for cosmic censorship, see §10.5. See also Senovilla (1998), Definition 3.1. 

282 Adapted from Clarke (1993), p. 10, who uses curves instead of geodesics (cf. §10.4). See also Manchak (2014). 

283 See Remark 5.45 on page 155 of O’Neill (1983) or Proposition 4.4.3 in Chruściel (2020). The proof is by 
contradiction: if M were extendible, then some such γ could be continued past b into M′ with a finite limit as t → b. 


6.1 Congruences of geodesics 


For the singularity theorems one needs a variation on conjugate points called focal points, which
are like conjugate points but now defined relative to a congruence of geodesics rather than a
single one. In general, a congruence of curves through an open set $U \subset M$ is simply a family
of curves such that each point of $U$ lies on exactly one such curve. This is automatic if the
congruence arises as the flow of a vector field defined on $U$, and vice versa, a congruence yields
a vector field as its tangent, so that congruences in $U$ and vector fields in $U$ are interchangeable.
The following constructions on a Lorentzian manifold may be performed in either the timelike or
lightlike case. We start with the former; see §6.3 for the latter. Thus we start from a fd timelike
vector field $u \in \mathfrak{X}(U)$ defined locally on some open $U \subset M$, normalized such that, at each $x \in U$,

$$g_x(u,u) = u_\mu(x)\,u^\mu(x) = -1. \tag{6.4}$$
The associated congruence, then, is obtained by integrating this vector field. In one example of
interest, $u$ is the 4-velocity of some (relativistic) fluid moving in the cosmos, but for Hawking’s
singularity theorem one starts from a spacelike hypersurface $\Sigma \subset M$ (which will eventually, but
not yet, be taken to be a Cauchy surface), and defines the congruence as consisting of all timelike
geodesics $\gamma$ emanating from $\Sigma$ with initial velocities normal to $\Sigma$, for as long as they do not cross
and do remain timelike. This condition defines the open set $U \supset \Sigma$; singularities arise (outside
$U$) if the geodesics either cross or become lightlike. The right parametrization of each $\gamma$ then
enforces (6.4), where $u = \dot\gamma$; recall that $g(\dot\gamma,\dot\gamma)$ is constant along $\gamma$, so that (6.4) persists in $t$.
The flow $\psi_t$ of $u$ (which at $x \in \Sigma$ of course is given by $\psi_t(x) = \gamma(t)$, where $\gamma$ is the geodesic
emanating from $x$ normal to $\Sigma$ and satisfying $g(\dot\gamma,\dot\gamma) = -1$), then defines $\Sigma_t = \psi_t(\Sigma)$, which is
diffeomorphic to $\Sigma$ as long as $\psi_t(x) \in U$ for all $x \in \Sigma$, and hence one obtains a disjoint family of
spacelike hypersurfaces $(\Sigma_t)_{t \in I}$, where $I \subset \mathbb{R}$ is some open interval.²⁸⁴ Alternatively, a temporal
function $t : U \to \mathbb{R}$ (cf. Definition 5.42) defines a family of spacelike hypersurfaces

$$\Sigma_t = \{x \in U \mid t(x) = t\}, \tag{6.5}$$

to each of which $u$ remains orthogonal. Indeed, if $X$ is a vector field on $\Sigma$, i.e. $X_x \in T_x\Sigma$, then
$(\psi_t)_* X \in T_{\psi_t(x)}\Sigma_t$ extends $X$ to $U$. Still calling this extension $X$, we have $\mathcal{L}_u X = 0$ throughout $U$
and $g_x(u,X) = 0$ for $x \in \Sigma$ by construction. Then


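$$\frac{d}{dt}\, g(u,X) = \nabla_u g(u,X) = g(\nabla_u u,X) + g(u,\nabla_u X) = g(u,\nabla_X u) = \tfrac12 \nabla_X g(u,u) = 0, \tag{6.6}$$

where we used $\nabla_u u = 0$ (since each $\gamma$ is a geodesic), torsion-freeness (3.47) of $\nabla$, which gives

$$\nabla_u X = \nabla_X u + [u,X] = \nabla_X u + \mathcal{L}_u X = \nabla_X u, \tag{6.7}$$

and finally (6.4). In terms of $t$, the unit normal $u$ to each $\Sigma_t$ is then given by

$$u = -L\,\nabla t; \tag{6.8}$$
$$L = 1/\sqrt{-g(\nabla t,\nabla t)}, \tag{6.9}$$

where the function $L : U \to \mathbb{R}^+$ is called the lapse; it will be taken up in §8.1.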


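284 This construction fails as soon as some $\gamma(t) = \psi_t(x)$ reaches a focal point, as discussed in the next section.
In that case, the map $\varphi(x,Y) = \exp_x(Y)$ from $N^\Sigma M$ to $M$ (where $N^\Sigma M$ is the normal vector bundle over $\Sigma$ to the
embedding $\Sigma \hookrightarrow M$, i.e. $X \in N^\Sigma_x M$ if $X \in T_xM$ and $X \perp T_x\Sigma$, where $x \in \Sigma$), which is a diffeomorphism from a nbhd
of the zero section in $N^\Sigma M$ to a tubular nbhd of $\Sigma$ in $M$, fails to be a diffeomorphism. Note that $\varphi(x,tu) = \psi_t(x)$.

A timelike vector field $u$ defines a fair number of derived tensors, each of some importance:

\begin{align}
a^\mu &:= u^\nu\nabla_\nu u^\mu && \text{(acceleration)}; \tag{6.10}\\
h_{\mu\nu} &:= g_{\mu\nu} + u_\mu u_\nu && \text{(spatial projection)}; \tag{6.11}\\
k_{\mu\nu} &:= h_\mu^\rho h_\nu^\sigma \nabla_\rho u_\sigma && \text{(minus extrinsic curvature)}; \tag{6.12}\\
\omega_{\mu\nu} &:= k_{[\mu\nu]} && \text{(vorticity)}; \tag{6.13}\\
\sigma_{\mu\nu} &:= k_{(\mu\nu)} - \tfrac13\theta h_{\mu\nu} && \text{(shear)}; \tag{6.14}\\
\theta &:= g^{\mu\nu}k_{\mu\nu} = h^{\mu\nu}k_{\mu\nu} = \operatorname{tr}(k) && \text{(expansion)}, \tag{6.15}
\end{align}

where $k_{(\mu\nu)} := \tfrac12(k_{\mu\nu} + k_{\nu\mu})$, $k_{[\mu\nu]} := \tfrac12(k_{\mu\nu} - k_{\nu\mu})$, and $h^\mu_\nu := \delta^\mu_\nu + u^\mu u_\nu$. It follows that

\begin{align}
k_{\mu\nu} &= \sigma_{\mu\nu} + \omega_{\mu\nu} + \tfrac13\theta h_{\mu\nu}; \tag{6.16}\\
\nabla_\mu u_\nu &= k_{\mu\nu} - u_\mu a_\nu; \tag{6.17}\\
\theta &= \nabla_\mu u^\mu. \tag{6.18}
\end{align}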


Eq. (6.16) is trivial. Eq. (6.17) can be checked by contracting both sides first with $u^\mu$, then
with $u^\nu$, and finally with vectors orthogonal to $u$. The first contraction merely reproduces the
definition (6.10). For the second contraction we use (6.4), (3.65), and (3.52) to compute

$$0 = \partial_\mu g(u,u) = (\nabla_\mu g)(u,u) + g(\nabla_\mu u,u) + g(u,\nabla_\mu u) = 0 + 2g(u,\nabla_\mu u), \tag{6.19}$$

whence $u^\nu\nabla_\mu u_\nu = g(u,\nabla_\mu u) = 0$. The third contraction reproduces the definition (6.12). Eq.
(6.18) follows from (6.17) and (6.15), since again $g(a,u) = 0$. The interpretation of the
acceleration $a = \nabla_u u$ is clear; by definition it vanishes for congruences of geodesics, for which

$$k_{\mu\nu} = \nabla_\mu u_\nu. \tag{6.20}$$

Furthermore, eq. (6.4) implies that $h^\mu_\nu$ is the projection onto the orthogonal complement of $u$,
since we have $h^\mu_\nu u^\nu = 0$, $h^\mu_\nu X^\nu = X^\mu$ whenever $g(u,X) = 0$, and finally, $h^\mu_\rho h^\rho_\nu = h^\mu_\nu$.

If $u$ comes from a spacelike hypersurface $\Sigma$ as explained above, the tensor $h_{\mu\nu}$ is a
four-dimensional “covariant” version of the three-dimensional induced metric in $\Sigma_t$, in that

$$h_{\mu\nu} = g(h\partial_\mu, h\partial_\nu) = h_\mu^\rho h_\nu^\sigma g_{\rho\sigma}. \tag{6.21}$$


Proposition 6.3 Let $u$ be a timelike vector field in $U \subset M$ as above. Then there exists a spacelike
surface $\Sigma \subset U$ to which the vectors $u$ are normal iff $\omega_{\mu\nu} = 0$ (i.e. $k_{\mu\nu} = k_{\nu\mu}$).


Proof. This follows from the Frobenius theorem, in the following form: the vectors orthogonal
to $u$ form an integrable distribution (that is, they span the tangent space to a (hyper)surface $\Sigma$
orthogonal to $u$) iff they close under the Lie bracket. Here, this means that the condition

$$g(u,X) = g(u,Y) = 0 \tag{6.22}$$

implies $g(u,[X,Y]) = 0$. By (3.47) this is the same as $g(u,\nabla_X Y) = g(u,\nabla_Y X)$, or as

$$g(\nabla_X u,Y) = g(\nabla_Y u,X), \tag{6.23}$$


assuming (6.22), since $0 = X(g(u,Y)) = g(\nabla_X u,Y) + g(u,\nabla_X Y)$, and similarly with $X$ and $Y$
swapped. But (6.23), given (6.22), is equivalent to $k_{\mu\nu} = k_{\nu\mu}$ and hence to $\omega_{\mu\nu} = 0$.

A good way to look at $\theta$ follows from the computation (7.25) in §7.1 below, in which $\partial V$
should be replaced by $\Sigma_t$ and $N$ by $u$. Using (local) coordinates $(x^i)$ on $\Sigma_t$, such that
$u x^i = 0$ for each $i = 1,2,3$, the geometric volume form on $\Sigma_t$ is given by

$$\sigma(x) := V(x)\, dx^1 \wedge dx^2 \wedge dx^3. \tag{6.24}$$

Then (7.25), which is $\mathcal{L}_u\sigma = \theta\sigma$, comes down to $V^{-1}\partial_t V = \theta$ or

$$\theta = \partial_t \ln(V). \tag{6.25}$$

The terminology used in (6.10) - (6.15) is hybrid: the geometric notion of extrinsic curvature
only really makes sense if the congruence arises from a hypersurface $\Sigma$ in the said way, whereas
the other terms rather come from fluid mechanics. The vorticity tensor describes the rotation of
the fluid, the shear (which is traceless) describes the directed volume-preserving expansion (or,
if negative, the contraction) of the fluid, and finally $\theta$ gives the rate of total volume increase (or,
if negative, the decrease) under the flow. This is shown in the following picture:²⁸⁵

[Figure: three panels with, from left to right, $\omega_{\mu\nu} \neq 0$, $\theta = 0$, $\sigma_{\mu\nu} = 0$; $\omega_{\mu\nu} = 0$, $\theta \neq 0$, $\sigma_{\mu\nu} = 0$; $\omega_{\mu\nu} = 0$, $\theta = 0$, $\sigma_{\mu\nu} \neq 0$.]

Left to right: effects of rigid rotation, uniform spherical expansion, and volume-preserving shear.


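We finally derive the fundamental Raychaudhuri equation for $\theta$.²⁸⁶ Using (4.13), we compute

\begin{align}
u^\rho\nabla_\rho(\nabla_\mu u_\nu) &= u^\rho\bigl(\nabla_\mu\nabla_\rho + [\nabla_\rho,\nabla_\mu]\bigr)u_\nu \nonumber\\
&= \nabla_\mu(u^\rho\nabla_\rho u_\nu) - (\nabla_\mu u^\rho)\nabla_\rho u_\nu + R_{\nu\rho\sigma\mu}u^\rho u^\sigma. \tag{6.26}
\end{align}

For geodesics the first term vanishes. Eqs. (6.15), (6.20), and (6.16) then yield, along $u$,

$$\nabla_u\theta = \dot\theta = -\tfrac13\theta^2 - \sigma_{\mu\nu}\sigma^{\mu\nu} + \omega_{\mu\nu}\omega^{\mu\nu} - R_{\mu\nu}u^\mu u^\nu. \tag{6.27}$$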
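
The passage from (6.26) to (6.27) may be spelled out as follows. This is only a sketch: it assumes that (6.16) is the decomposition $k_{\mu\nu} = \sigma_{\mu\nu} + \omega_{\mu\nu} + \tfrac13\theta h_{\mu\nu}$ of $k_{\mu\nu}$ into its traceless symmetric, antisymmetric, and trace parts, and that the sign conventions of (4.13) are such that the curvature term in (6.26) traces to $-R_{\mu\nu}u^\mu u^\nu$, as (6.27) requires. Contracting (6.26) with $g^{\mu\nu}$ for a geodesic congruence and using (6.20) gives

```latex
\begin{align*}
u^\rho\nabla_\rho\theta
  &= -(\nabla_\mu u^\rho)(\nabla_\rho u^\mu) - R_{\mu\nu}u^\mu u^\nu
   = -k_{\mu\rho}k^{\rho\mu} - R_{\mu\nu}u^\mu u^\nu, \\
k_{\mu\rho}k^{\rho\mu}
  &= \sigma_{\mu\rho}\sigma^{\mu\rho} - \omega_{\mu\rho}\omega^{\mu\rho}
     + \tfrac19\theta^2\, h_{\mu\rho}h^{\mu\rho}
   = \sigma_{\mu\nu}\sigma^{\mu\nu} - \omega_{\mu\nu}\omega^{\mu\nu} + \tfrac13\theta^2 .
\end{align*}
```

Here the cross terms drop out because $\sigma_{\mu\nu}$ is symmetric and traceless whereas $\omega_{\mu\nu}$ is antisymmetric, the minus sign in front of $\omega_{\mu\rho}\omega^{\mu\rho}$ comes from $\omega^{\rho\mu} = -\omega^{\mu\rho}$, and $h_{\mu\rho}h^{\mu\rho} = 3$ since $h$ projects onto a three-dimensional subspace.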


This equation acts as a key lemma to Hawking’s singularity theorem, to which we now turn. 


285 Redrawn from Malament (2012), p. 174, by Edith de Jong. Caption also due to Malament.
286 Amal Kumar Raychaudhuri (1923-2005) was an Indian physicist. The July 2007 issue of Pramana-Journal of
Physics (Volume 69, no. 1) is devoted to him and his equation. See also Earman (1999) for its historical context.

6.2 Hawking’s singularity theorem


Hawking’s singularity theorem from 1965-1966 remains exemplary because of the clarity of 
its hypotheses, its spectacular conclusion, and the elegance of its proof. Here it is: 


Theorem 6.4 Let $(M,g)$ be a space-time. Assume that:²⁸⁷


1. $(M,g)$ is globally hyperbolic;
2. One has $R_{\mu\nu}\dot\gamma^\mu\dot\gamma^\nu \geq 0$ along all timelike geodesics $\gamma$ in $M$ (timelike curvature condition);


3. The mean extrinsic curvature $H$ of some spacelike Cauchy surface $\Sigma$ (defined with respect
to future directed timelike normals) satisfies $H(x) \leq H_0$ for some $H_0 < 0$ and all $x \in \Sigma$.


Then $(M,g)$ contains incomplete timelike geodesics: specifically, no past directed timelike
geodesic $\gamma$ emanating from $\Sigma$ can have arc length (i.e. proper time) $L(\gamma) > 3/|H_0|$.


Before discussing the detailed meaning of the assumptions made here (including the notation $H$),
let us state their generic nature, which is common to all singularity theorems in GR so far:²⁸⁸


1. is a global causality assumption; 


2. is a dynamical assumption on the curvature motivated by a corresponding assumption on
the energy-momentum tensor $T_{\mu\nu}$ in the Einstein equations (see below);


3. is a static assumption on the curvature, i.e. a boundary condition imposed at some fixed 
time, typically empirically motivated (in this case, by the expansion of the universe). 


We have amply discussed global hyperbolicity: it is the strongest generic causality assumption.
Logically speaking, strong assumptions weaken theorems. But since Hawking’s theorem merely
states that $(M,g)$ is timelike incomplete without giving any indication why that is, global
hyperbolicity at least strengthens the inference from geodesic incompleteness to $(M,g)$ having
some “interesting” singularity, see the discussion preceding Lemma 5.29 in §5.7.

As the proof will show, assumption 2 means that observers moving on timelike geodesics see
these converge (by tidal forces), so that gravity is attractive. The Einstein equations

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = 8\pi G\, T_{\mu\nu}, \tag{6.28}$$

where $T_{\mu\nu}$ is the energy-momentum tensor (discussed in more detail in §7.3), relate assumption
2 to the matter content of the universe, notably to the so-called Strong Energy Condition (SEC)

$$T_{\mu\nu}\dot\gamma^\mu\dot\gamma^\nu \geq -\tfrac12 T, \tag{6.29}$$

where $T = g^{\mu\nu}T_{\mu\nu}$. Writing $\dot\gamma = u$, the energy relative to our observer is $E = T_{\mu\nu}u^\mu u^\nu$ and the
average (spatial) stresses are $S = h^{\mu\nu}T_{\mu\nu}$, where $h$ is defined by (6.11). The inequality
(6.29) then simply becomes $E \geq -S$, which is satisfied by most classical forms of matter.²⁸⁹
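
To make the inequality $E \geq -S$ concrete, one may evaluate it for a perfect fluid. This is an illustration only: the perfect-fluid form of $T_{\mu\nu}$, with energy density $\rho$ and pressure $p$, is an assumption not made in the text, and $S$ is taken to be $h^{\mu\nu}T_{\mu\nu}$ with $h$ as in (6.11). For an observer comoving with the fluid,

```latex
\begin{align*}
T_{\mu\nu} &= (\rho + p)\,u_\mu u_\nu + p\, g_{\mu\nu}, \qquad g(u,u) = -1, \\
E &= T_{\mu\nu}u^\mu u^\nu = \rho, \qquad
S = h^{\mu\nu}T_{\mu\nu} = 3p, \qquad
T = g^{\mu\nu}T_{\mu\nu} = -\rho + 3p, \\
T_{\mu\nu}u^\mu u^\nu \geq -\tfrac12 T
  \;&\Longleftrightarrow\; \rho \geq \tfrac12(\rho - 3p)
  \;\Longleftrightarrow\; \rho + 3p \geq 0
  \;\Longleftrightarrow\; E \geq -S .
\end{align*}
```

In this example dust ($p = 0$) and radiation ($p = \rho/3$) with $\rho \geq 0$ satisfy the condition for the comoving observer, whereas a fluid with $p < -\rho/3$ violates it.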


287 We state the physically relevant case: there is a mathematically equivalent version assuming H > 0, in which 
case the future directed timelike geodesics will be incomplete (this describes a big crunch instead of a big bang). 

288 See Senovilla (1998) for a detailed introduction and overview, including the structure laid out here.

289 See §7.3 for a brief discussion of energy conditions in GR, as well as Curiel (2014a) and Martin-Moruno &
Visser (2017) for extensive information. As Curiel notes, ‘[SEC] says that ordinary mass-energy density cannot be
negatively dominated by the sum of the individual pressures (momentum fluxes) at any point, as determined by an
observer traversing a timelike curve. I know of no compelling elucidation of the physical content of that relation.’

The mean extrinsic curvature of $\Sigma \subset M$ was already introduced in §4.3, see (4.56), though
in one dimension less and in the special case where $M = \mathbb{R}^3$ with Euclidean metric. The
general case was taken up in §4.7. We recall that first, the extrinsic curvature of $\Sigma \subset M$ is a
tensor field $\tilde{k} \in \mathfrak{X}^{(2,0)}(\Sigma)$ initially defined merely on $\Sigma$ by

$$\tilde{k}(X,Y) := -g(\nabla_X u,Y), \tag{6.30}$$

where $X,Y \in \mathfrak{X}(\Sigma)$ and $u$ is the fd normal vector field on $\Sigma$ discussed in the previous section.
This definition is predicated on the fact that $\nabla_X u$ is tangent to $\Sigma$, which is an easy consequence
of the property $g(u,u) = -1$. Similarly, it is easy to show that $\tilde{k}$ is symmetric, namely:

$$\tilde{k}(X,Y) = -g(\nabla_X u,Y) = g(u,\nabla_X Y) = g(u,\nabla_Y X) = \tilde{k}(Y,X). \tag{6.31}$$

The tensor $k_{\mu\nu}$ is just a covariant version of $-\tilde{k}$, in that $k(u,u) = k(u,X) = k(X,u) = 0$
whenever $g(u,X) = 0$, and $k(X,Y) = -\tilde{k}(X,Y)$ if $g(u,X) = 0$ and $g(u,Y) = 0$, where $k(A,B) =
k_{\mu\nu}A^\mu B^\nu$. From $\tilde{k}$, we define the mean (extrinsic) curvature $H : \Sigma \to \mathbb{R}$ of $\Sigma$ as

$$H(x) := \operatorname{tr}(\tilde{k}_x) = \sum_{i=1}^{3} \tilde{k}_x(e_i(x),e_i(x)), \tag{6.32}$$

where $(e_i(x))$ is any orthonormal basis of $T_x\Sigma$, $x \in \Sigma$. Since $k$ in (6.12) is spatial (because of the
projections $h$ in its definition), the (conventional) minus sign in (6.30) implies that on $\Sigma$ we have

$$\theta = -H. \tag{6.33}$$

Let us recall some Riemannian examples from §4.3, see (4.66), (4.70), and (4.74):

• For any plane in $\mathbb{R}^3$ (with flat metric) we have $H = 0$, and hence $\theta = 0$.

• For the cylinder of radius $\rho$, i.e., $C^2_\rho \subset \mathbb{R}^3$ (with flat metric), we have $\theta = 1/\rho$.

• For the sphere of radius $\rho$, i.e., $S^2_\rho \subset \mathbb{R}^3$ (with flat metric), we have $\theta = 2/\rho$.


Here the normal vectors used in the definition (6.30) are outward, and we see from these examples
that negative $H$, and hence positive $\theta$, gives diverging geodesics normally emanating from $\Sigma$. By
the same token, negative $\theta$ gives converging normal geodesics, which in our universe happens in
the past direction, so we have $H > 0$ on $\Sigma$ in the past direction and hence $H < 0$ in the future
direction, as assumed in the theorem (where $\dot\gamma = u$ is taken to be future directed). The Lorentzian
example is the FLRW universe (cf. §8.3), where (8.88) gives $\theta(t) = 3\dot{a}(t)/a(t)$.
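
The FLRW formula can be tried out numerically. The sketch below is an illustration under stated assumptions rather than part of the text: it takes a power-law scale factor $a(t) = t^p$ with $p = 2/3$ (a matter-dominated universe), and all function names are hypothetical. In that case $\theta(t) = 2/t$, so the bound $3/|H_0| = 3/\theta_0$ of Theorem 6.4, evaluated on the slice at time $t$, equals $3t/2$ and indeed exceeds the actual age $t$.

```python
def scale_factor(t, p=2.0 / 3.0):
    """Illustrative power-law scale factor a(t) = t**p (assumption: p = 2/3)."""
    return t ** p


def expansion(t, p=2.0 / 3.0, dt=1e-6):
    """theta(t) = 3 * a'(t) / a(t), with a'(t) from a central finite difference."""
    a_dot = (scale_factor(t + dt, p) - scale_factor(t - dt, p)) / (2.0 * dt)
    return 3.0 * a_dot / scale_factor(t, p)


def age_bound(t, p=2.0 / 3.0):
    """Bound 3/theta_0 on past proper time, for the slice at cosmic time t."""
    return 3.0 / expansion(t, p)


if __name__ == "__main__":
    t_now = 1.0
    print("theta(t_now)  =", expansion(t_now))
    print("bound 3/theta =", age_bound(t_now), "vs. actual age", t_now)
```

For other exponents $0 < p < 1$ the same computation gives the bound $t/p$, which is again larger than $t$.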

Proof. Given assumption 3, we can work in the setting explained after (6.4) in the previous section.
We write $\dot\gamma = u$ and return to the Raychaudhuri equation (6.27). Since $\sigma_{\mu\nu}$ is symmetric and
spatial, we have $\sigma_{\mu\nu}\sigma^{\mu\nu} = \operatorname{Tr}(\sigma^2) \geq 0$, where $\sigma$ is the matrix with components $\sigma^\mu{}_\nu = h^{\mu\rho}\sigma_{\rho\nu}$.
Furthermore, $\omega = 0$ by Proposition 6.3, whilst assumption 2 in Theorem 6.4 gives

$$R_{\mu\nu}u^\mu u^\nu \geq 0. \tag{6.34}$$

Therefore, the Raychaudhuri equation (6.27) gives $\dot\theta + \tfrac13\theta^2 \leq 0$, i.e. $\frac{d}{dt}\theta^{-1} \geq \tfrac13$. Assumption 3
gives $\theta \geq \theta_0 > 0$ on $\Sigma$, where $\theta_0 = -H_0$, whence $0 < \theta^{-1} \leq \theta_0^{-1}$ on $\Sigma$. Integrating backward from
$t = 0$ gives $\theta^{-1}(t) \leq \theta_0^{-1} + t/3$ for $t \leq 0$. It then follows that $\theta^{-1} \to 0$,
or $\theta \to \infty$, at some time $t_s \in [-3\theta_0^{-1}, 0)$, i.e., backward in time, provided that the geodesic in
question can indeed be extended to $t_s$. By (6.25), this corresponds to $V \to 0$.

We now transform the divergence of $\theta$ into a conclusion about conjugate points. Within the
congruence we take some fixed geodesic $\gamma : [-a,0] \to M$ with $\dot\gamma(t) = u(\gamma(t))$ for all $t$ and $\gamma(0) =
x \in \Sigma$. As in §5.1, define a smooth family of neighbouring geodesics $(\gamma_s)$ all lying in the same
congruence, that is, $\gamma_s(0) \in \Sigma$ and $\dot\gamma_s(t) = u(\gamma_s(t))$ for all $s$ and $t$, and define the associated
Jacobi field $J$ by (5.10). Since $\dot\gamma = u$, we may then rewrite (5.5) as

$$\nabla_u J = \nabla_J u. \tag{6.35}$$

Since $\dim(\Sigma) = 3$, the space $\mathcal{J}_\gamma^{(\Sigma)} \subset \mathcal{J}_\gamma$ of Jacobi fields arising in this way is 3-dimensional.
Hence it is convenient to introduce a moving frame $(e_1(t),e_2(t),e_3(t))$ along $\gamma(t)$, that is, an
orthonormal basis of $T_{\gamma(t)}\Sigma_t$ (see Proposition 5.2) for each $t$; such a frame can be constructed by
solving $\nabla_{\dot\gamma}e_i = \nabla_u e_i = 0$ with orthonormal initial conditions at $t = 0$; this equation guarantees
that the frame remains orthonormal as well as orthogonal to $\dot\gamma$.²⁹⁰ We may then expand

$$J(t) = J_i(t)e_i(t) \equiv \sum_{i=1}^{3} J_i(t)e_i(t); \tag{6.36}$$
$$J_i(t) = g_{\gamma(t)}(J(t),e_i(t)). \tag{6.37}$$

Furthermore, the covariant derivative $\nabla_{\dot\gamma}J = \nabla_u J$ now becomes a time-derivative, since

$$(\nabla_u J)_i(t) = g_{\gamma(t)}(\nabla_u J(t),e_i(t)) = u\bigl(g_{\gamma(t)}(J(t),e_i(t))\bigr) - g_{\gamma(t)}(J(t),\nabla_u e_i(t)) = \dot{J}_i(t), \tag{6.38}$$

where $\dot{J}_i(t) = dJ_i(t)/dt$. Using (6.35) and $\nabla_u e_i = 0$, we obtain

\begin{align}
\dot{J}_i(t) &= g_{\gamma(t)}(\nabla_u J(t),e_i(t)) = g_{\gamma(t)}(\nabla_J u(t),e_i(t)) = J_j(t)\, g_{\gamma(t)}(\nabla_j u(t),e_i(t)) \nonumber\\
&= k_{ij}(t)J_j(t), \tag{6.39}
\end{align}

where $\nabla_j := \nabla_{e_j}$ and, keeping in mind that $k_{ij} = -\tilde{k}_{ij}$, cf. the minus sign in (6.30),

$$k_{ij}(t) = g_{\gamma(t)}(\nabla_j u(t),e_i(t)) = k_{\mu\nu}(\gamma(t))\, e_j^\mu(t)\, e_i^\nu(t), \tag{6.40}$$

are, up to this sign, the components of (6.30) in the frame $(e_i(t))$, with $k_{ij} = k_{ji}$. Eq. (5.7) then reads

$$\ddot{J}_i(t) = \Omega_{ij}(t)J_j(t), \tag{6.41}$$
$$\Omega_{ij} := g(e_i,\Omega(u,e_j)u) = R(e_i,u,e_j,u). \tag{6.42}$$

Conversely, a simple dimension count shows that the Jacobi fields $J \in \mathcal{J}_\gamma^{(\Sigma)}$ along $\gamma$ that arise
from the congruence emanating from $\Sigma$ in the said way are those that satisfy the initial condition

$$\dot{J}_i(a) = k_{ij}(a)J_j(a), \tag{6.43}$$

predicated on (6.36) - (6.37), so that (6.43) implicitly also assumes the initial condition

$$J(a) \in T_{\gamma(a)}\Sigma \;\Leftrightarrow\; g_{\gamma(a)}(J,u) = 0. \tag{6.44}$$

Conversely, it follows from the Jacobi equation (6.41) with (6.42) that both initial conditions
(6.39) and (6.44) are propagated along $\gamma$, i.e., hold for all $t$ where $\gamma(t)$ is defined.
For our proof we now need the following variation on Definition 5.10, backward in time.

290 This simple construction works because $\gamma$ is a geodesic. Along more general curves one needs the so-called
Fermi derivative of $e_i$ instead of the covariant derivative $\nabla_{\dot\gamma}e_i$. See e.g. Hawking & Ellis, §4.1.

Definition 6.5 A point $x = \gamma(c)$, where $c \in [-a,0)$, is focal relative to $y = \gamma(0)$, seen as a
member of the congruence of normal geodesics to $\Sigma$, if there is a nonzero Jacobi field $J \in \mathcal{J}_\gamma^{(\Sigma)}$,
i.e. satisfying the Jacobi equation along $\gamma$ with initial condition (6.43), for which $J(c) = 0$.

To elucidate this, for any $t \in [-a,0]$ we define two “double evaluation maps” $A_t$, $B_t$ by

$$A_t, B_t : \mathcal{J}_\gamma \to T_{\gamma(0)}M \oplus T_{\gamma(t)}M; \tag{6.45}$$
$$A_t(J) = (J(0),J(t)); \qquad B_t(J) = (\nabla_u J(0) - k_{\gamma(0)}J(0),\, J(t)). \tag{6.46}$$


Both maps are linear, and we see that $\gamma(c)$ is conjugate relative to $\gamma(0)$ iff $A_c$ is singular, whereas
it is focal iff $B_c$ is singular. Despite this difference, Theorem 5.12 applies, mutatis mutandis:


Theorem 6.6 A timelike geodesic $\gamma : [-a,0] \to M$ in a congruence as above locally maximizes
the length of (pd) curves from $\gamma(0)$ to $\gamma(-a)$ iff there is no focal point $\gamma(c)$ on $\gamma$, $c \in [-a,0]$.


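The proof is the same as for Theorem 5.12, since Synge’s formula (5.21) still holds. This is
remarkable in itself, since in its derivation one now picks up boundary terms at $a$, because this
time $J(0) \neq 0$. There are two: first, after (5.18) one needs to add $-g_{\gamma(0)}(\nabla_s\gamma',\dot\gamma)$, which equals

$$-g_{\gamma(0)}(\nabla_s\gamma',\dot\gamma) = -\frac{d}{ds}\, g_{\gamma(0)}(\gamma',\dot\gamma) + g_{\gamma(0)}(\gamma',\nabla_s\dot\gamma) = g_{\gamma(0)}(\gamma',\nabla_t\gamma') = g_{\gamma(0)}(J,\nabla J), \tag{6.47}$$

since in our case $g_{\gamma(0)}(\gamma',\dot\gamma) = g_{\gamma(0)}(J,u) = 0$, and we also used (5.5). Second, after (5.20) one
picks up $-g_\gamma(\gamma'_\perp,\nabla_t\gamma'_\perp) = -g_{\gamma(0)}(J,\nabla J)$, which fortunately cancels the term in (6.47).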

We now relate the existence of focal points to the expansion $\theta$ of the congruence. It follows
from Proposition 5.2, which makes $J(t)$ depend linearly on the initial conditions $J(0)$ and $\dot{J}(0)$,
and eq. (6.43), according to which $\dot{J}(0)$ depends linearly on $J(0)$, that if $J \in \mathcal{J}_\gamma^{(\Sigma)}$, then

$$J_i(t) = A_{ij}(t)J_j(0) \qquad (t \in [-a,0]), \tag{6.48}$$

for some $3 \times 3$ matrix $A(t)$. From (6.39) and (6.48) we obtain

$$\dot{J}_i(t) = \dot{A}_{ij}(t)J_j(0) = k_{ij}(t)J_j(t) = k_{ij}(t)A_{jk}(t)J_k(0), \tag{6.49}$$

so that $\dot{A}_{ik} = k_{ij}A_{jk}$, or $k = \dot{A}A^{-1}$, and hence, since $\theta = \operatorname{tr}(k) = -\operatorname{tr}(\tilde{k})$, we finally obtain

$$\theta = \operatorname{tr}(\dot{A}A^{-1}). \tag{6.50}$$
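
As a side remark not made in the text, (6.50) can be rewritten with Jacobi’s formula for the derivative of a determinant, which may make the link with (6.25) more transparent. Assuming $A(0)$ is the identity and $A(t)$ is invertible on the interval considered, one has

```latex
\begin{equation*}
\theta(t) = \operatorname{tr}\bigl(\dot{A}(t)A(t)^{-1}\bigr) = \frac{d}{dt}\ln\det A(t),
\qquad\text{so that}\qquad
\det A(t) = \exp\Bigl(-\int_t^0 \theta(s)\,ds\Bigr) \quad (t \leq 0).
\end{equation*}
```

In this form $\det A(t)$ plays the role of the volume ratio $V(t)/V(0)$ in (6.25): if $\theta \to \infty$ fast enough that the integral diverges as $t$ decreases to $c$, then $\det A(t) \to 0$, so that $A(c)$ acquires a zero eigenvalue.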


Now $A$ is finite along $\gamma$, and so is $\dot{A}$. Hence in the scenario just considered, $\theta$ blows up
at $t = c$ iff $A(c)^{-1}$ does, i.e., iff $A(c)$ has an eigenvalue zero (recall that $A(t)$ equals the identity at $t = 0$).


But as explained after (6.46), this implies that there exists some $J \in \mathcal{J}_\gamma^{(\Sigma)}$ for which $J(c) = 0$,
which means that $\gamma(c)$ is a focal point with respect to $\gamma(0)$. So if $\theta(\gamma(0)) > 0$, then $\gamma(c)$ is a
focal point with respect to $\gamma(0)$ iff $\lim_{t \to c}\theta(t) = \infty$. Therefore, the argument after (6.34) gives:


Proposition 6.7 Let $\gamma$ be an element of the congruence of timelike geodesics orthogonal to a
spacelike hypersurface $\Sigma \subset M$. If the positive curvature condition (6.34) holds along $\gamma$, and if
$\theta(\gamma(0)) > 0$ somewhere along $\gamma$, then $\gamma$ has an earlier focal point $\gamma(c)$ relative to $\gamma(0)$, provided
that the geodesic in question can indeed be extended (backward) from $t = 0$ all the way to $t = c$.

Similarly, we need a corollary to Theorem 5.30 describing the case where $x$ is connected to $\Sigma$
(rather than to a given single point $y$) by a length-maximizing timelike geodesic.


Corollary 6.8 Let $\Sigma \subset M$ be a spacelike Cauchy hypersurface. For any $x \in I^-(\Sigma)$ there is a
(not necessarily unique) future-directed timelike geodesic from $x$ to $\Sigma$ that maximizes length
among all timelike geodesics from $x$ to $\Sigma$. This geodesic necessarily crosses $\Sigma$ orthogonally.


Proof. It is easy to see that $J^+(x) \cap \Sigma$ is compact. Indeed, even $J^+(x) \cap D^-(\Sigma)$ is compact.²⁹¹
Furthermore, if $(M,g)$ is globally hyperbolic, then the Lorentzian distance function $d_L$ defined
by (5.118) is continuous in both arguments.²⁹² For any $y \in J^+(x) \cap \Sigma$, Theorem 5.30 gives a
length-maximizing causal geodesic from $x$ to $y$ whose length equals $d_L(x,y)$, so keeping $x$ fixed
we have a continuous function of $y$ that assumes a maximum on the compact set $J^+(x) \cap \Sigma$, say
at $y_0$. Since $x \in I^-(\Sigma)$, the maximizing geodesic $\gamma$ from $x$ to $y_0$ must be timelike by Proposition
5.13. Finally, the boundary term $[g(\gamma',\dot\gamma)]_a^b$ in (5.15), which has to vanish, shows that $\gamma$ crosses $\Sigma$
orthogonally, since the variation $\gamma'$ vanishes at $a = 0$ and is tangent to $\Sigma$ at $b$.


At last, we are now in a position to prove Hawking’s Theorem 6.4. It is sufficient to prove it
for timelike geodesics normally emanating from $\Sigma$, since by Corollary 6.8 timelike geodesics
not starting normally from $\Sigma$ are shorter than those which do. The proof is by contradiction.


1. Take some $x \in I^-(\Sigma)$ and a length-maximizing timelike geodesic $\gamma$ from $x$ to $y_0 \in \Sigma$, as in
the proof of Corollary 6.8. Suppose $L(\gamma) > 3/|H_0|$.


2. Since $\gamma$ crosses $\Sigma$ orthogonally, it is a member of the congruence described in the previous
section, and so by Proposition 6.7, $\gamma$ will have focal points (backward in time).


3. By Theorem 6.6, the length-maximizing $\gamma$ cannot have any focal points.


4. Hence $\gamma$ cannot exist: it must have stopped before $t = -3/|H_0|$. Hence it is incomplete
and the contradiction is resolved because $\gamma$ never reaches its would-be focal point.


5. Any $y \in \Sigma$ can be reached in this way, and as already noted, the conclusion that normal
timelike geodesics are incomplete implies the same conclusion for any timelike geodesic
starting at $\Sigma$, with the same bound (since they are shorter than the normal ones).


Here assumption 1 in the theorem (i.e. global hyperbolicity) is used (via Corollary 6.8) in step 1,
whereas the two curvature assumptions are exploited in step 2. Note that the time to reach the
singularity increases as the mean extrinsic curvature $|H_0|$ decreases, in accordance with intuition:
less curvature means less focusing (and a higher age of the universe). Of course, timelike
geodesic incompleteness can still be proved if the uniform bound on the extrinsic curvature is
replaced by local bounds, but the stated version of Hawking’s theorem is meant to describe the
big bang, where every inextendible past-directed timelike geodesic ends.²⁹³
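
The dependence of the focusing time on $\theta_0 = -H_0$ can also be explored numerically. The following sketch is not part of the original argument: the function name is hypothetical, and the constant `extra` is an illustrative stand-in for the non-negative terms $\sigma_{\mu\nu}\sigma^{\mu\nu} + R_{\mu\nu}u^\mu u^\nu$ in the Raychaudhuri equation (6.27) with $\omega = 0$. It integrates $w = \theta^{-1}$ backward in time, using $dw/d\tau = -\tfrac13 - \texttt{extra}\cdot w^2$ with $\tau = -t$, and returns the proper time at which $\theta$ blows up.

```python
def focusing_time(theta0, extra=0.0, dtau=1e-4):
    """Proper time tau = -t (into the past) at which theta diverges.

    Integrates w = 1/theta with dw/dtau = -(1/3) - extra * w**2, where the
    constant `extra` >= 0 is an assumed stand-in for shear plus Ricci terms.
    """
    if theta0 <= 0.0:
        raise ValueError("need theta0 > 0")

    def rhs(w):
        return -(1.0 / 3.0) - extra * w * w

    w, tau = 1.0 / theta0, 0.0
    while True:
        # one classical fourth-order Runge-Kutta step
        k1 = rhs(w)
        k2 = rhs(w + 0.5 * dtau * k1)
        k3 = rhs(w + 0.5 * dtau * k2)
        k4 = rhs(w + dtau * k3)
        w_next = w + dtau * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        if w_next <= 0.0:
            # locate the zero of w (theta -> infinity) by linear interpolation
            return tau + dtau * w / (w - w_next)
        w, tau = w_next, tau + dtau


if __name__ == "__main__":
    for theta0 in (0.5, 1.5, 3.0):
        print(theta0, focusing_time(theta0), 3.0 / theta0)
```

With `extra = 0` the result should agree with the bound $3/\theta_0$, and any positive `extra` should bring the blow-up closer to $\Sigma$, in line with the remark that less curvature means less focusing.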


291 See O’Neill (1983), Lemma 14.40 for the even more general statement that for any achronal set $A$ and $x \in
\operatorname{int}(D(A)) \setminus J^+(A)$, the set $J^+(x) \cap D^-(A)$ is compact. Correct intuition is obtained by taking a future inextendible
fd timelike curve $c$ from $x$ crossing $\Sigma$. Then $J^-(c(t))$ eventually covers all of $M$ as $t \to \infty$, so that there is $t_0$ such
that $J^+(x) \cap \Sigma \subset J^-(c(t_0))$. Hence $J^+(x) \cap \Sigma \subset J^-(c(t_0)) \cap J^+(x)$, which is compact by global hyperbolicity.
Since $\Sigma$ and $J^+(x)$ are closed, this implies that $J^+(x) \cap \Sigma$ is a closed subset of a compact set, so that it is compact.

292 See, with increasing eye for detail, Hawking (1966/2014), pp. 482-483, O’Neill (1983), Lemma 14.21, and
Minguzzi (2019), Theorems 3.48 and 4.124. The key point is that $d_L$ is always lower semicontinuous, upon which
global hyperbolicity also makes it upper semicontinuous, using upper semicontinuity of $L(\cdot)$ and Lemma 5.26.

293 Theorem 5 in Hawking (1966/2014) is a singularity theorem under weaker conditions, keeping the two curvature
assumptions but replacing global hyperbolicity by the mere existence of a compact spacelike hypersurface. The
version above is Hawking’s Theorem 3. See also O’Neill (1983), Theorems 14.55A and 14.55B.

6.3 Null congruences and trapped surfaces


This section prepares for Penrose’s singularity theorem from 1965. Whereas Hawking’s theorem
is based on timelike geodesics and more generally on ‘timelike reasoning’, Penrose’s theorem
(indeed his entire approach to GR, cf. §1.9), is based on ‘lightlike reasoning’, based on null
geometry. This requires a preamble, which we started in §4.6 and now continue.²⁹⁴


Proposition 6.9 If $\Sigma \subset M$ is a null hypersurface with normal null vector field

$$L_x \in (T_x\Sigma)^\perp \subset T_x\Sigma; \qquad g(L,L) = 0, \tag{6.51}$$

then the flow lines of $L$ are lightlike pregeodesics and hence $\Sigma$ is ruled by lightlike geodesics.²⁹⁵


To prove this, for any $X \in T\Sigma$, so that $g(X,L) = 0$, we compute (omitting the subscript $x \in \Sigma$):

\begin{align}
0 &= L\,g(X,L) = (\nabla_L g)(X,L) + g(\nabla_L X,L) + g(X,\nabla_L L) = g(\nabla_L X,L) + g(X,\nabla_L L); \nonumber\\
\Rightarrow\; g(\nabla_L L,X) &= -g(L,\nabla_L X) = -g(L,\nabla_X L) - g(L,[L,X]) = 0, \tag{6.52}
\end{align}

where we used torsionlessness of the Levi-Civita connection $\nabla$, as well as the computations

$$g(L,\nabla_X L) = \tfrac12 X g(L,L) = 0; \tag{6.53}$$
$$g(L,[L,X]) = g(L,\mathcal{L}_L X) = 0, \tag{6.54}$$

as follows from (3.50) and (2.35), respectively; if $L \in T\Sigma$ and $X \in T\Sigma$, then also $\mathcal{L}_L X \in T\Sigma$ and
hence this vector is orthogonal to $L$. Eq. (6.52) implies that $\nabla_L L$ is orthogonal to every vector $X$
tangent to $\Sigma$, and hence must be proportional to its normal $L$. By Proposition 3.8, the flow of $L$
may therefore be (re)parametrized so as to be geodesic. See also Theorem 5.5.2.


As a special case, consider a hypersurface $\Sigma = \{u = c\}$ (locally) defined by a smooth function
$u$, where $c \in \mathbb{R}$ (or rather a family thereof). Then $N = L = \nabla u$, so if $\Sigma$ is null, then $u$ satisfies

$$g(\nabla u,\nabla u) = 0. \tag{6.55}$$

This is the basic eikonal equation of hyperbolic PDEs, and $u$ is called an optical function. For
example, the coordinate functions $u = t - r$ and $v = t + r$ on Minkowski space-time are optical.
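
The claim that $u = t - r$ and $v = t + r$ solve the eikonal equation (6.55) can be checked numerically. The sketch below is only an illustration: it assumes Cartesian coordinates $(t,x,y,z)$ on Minkowski space-time with signature $(-,+,+,+)$, so that $g(\nabla f,\nabla f) = -(\partial_t f)^2 + |\nabla_{\mathbf{x}} f|^2$, it approximates the derivatives by central differences, and all function names are hypothetical.

```python
import math


def minkowski_norm_of_gradient(f, t, x, y, z, h=1e-5):
    """Return -(df/dt)^2 + (df/dx)^2 + (df/dy)^2 + (df/dz)^2 via central differences."""
    df_dt = (f(t + h, x, y, z) - f(t - h, x, y, z)) / (2.0 * h)
    df_dx = (f(t, x + h, y, z) - f(t, x - h, y, z)) / (2.0 * h)
    df_dy = (f(t, x, y + h, z) - f(t, x, y - h, z)) / (2.0 * h)
    df_dz = (f(t, x, y, z + h) - f(t, x, y, z - h)) / (2.0 * h)
    return -df_dt ** 2 + df_dx ** 2 + df_dy ** 2 + df_dz ** 2


def radius(x, y, z):
    """Euclidean radius r = sqrt(x^2 + y^2 + z^2)."""
    return math.sqrt(x * x + y * y + z * z)


def u_retarded(t, x, y, z):
    """u = t - r."""
    return t - radius(x, y, z)


def v_advanced(t, x, y, z):
    """v = t + r."""
    return t + radius(x, y, z)


def time_function(t, x, y, z):
    """The coordinate t itself, whose gradient is timelike rather than null."""
    return t


if __name__ == "__main__":
    point = (0.3, 1.0, 2.0, -0.5)
    for f in (u_retarded, v_advanced, time_function):
        print(f.__name__, minkowski_norm_of_gradient(f, *point))
```

Away from $r = 0$ the first two values should vanish up to discretization error, whereas the coordinate $t$ has $g(\nabla t,\nabla t) = -1$ and hence is not optical.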


Lemma 6.10 If $u$ is an optical function, then the flow of $L = \nabla u$ consists of lightlike geodesics.

From $\nabla_\mu\partial_\nu u = \nabla_\nu\partial_\mu u$ for any function $u$ (since $\nabla$ is torsion-free), for any vector field $X$,

$$X^\nu L^\mu \nabla_\mu\partial_\nu u = X^\nu L^\mu \nabla_\nu\partial_\mu u, \tag{6.56}$$

i.e. $g(X,\nabla_L L) = g(L,\nabla_X L)$, where $L = \nabla u$. By (6.53), which is true for any $X$ (not necessarily
in $T\Sigma$) as long as $g(L,L) = 0$, this gives

$$g(X,\nabla_L L) = 0 \tag{6.57}$$

for any $X$, whereas the previous lemma merely showed this for $X \in T\Sigma$. Hence $\nabla_L L = 0$. Thus
the flow of $L$ consists of lightlike geodesics (without the need for reparametrization).


294 Kupeli (1987), Aretakis (2013), and Galloway (2014, 2017) are useful introductions to null geometry.
295 I.e., a unique lightlike geodesic passes through every point of $\Sigma$. Conversely, a hypersurface is null if it is ruled
by lightlike geodesics and locally achronal, see Kupeli (1987), Theorem 1 and Minguzzi (2019), Theorem 6.7.

A section of a null hypersurface $\Sigma$ is a two-dimensional surface $S \subset \Sigma$ such that $T_xS$ is spacelike
(i.e. Riemannian) for each $x \in S$; hence $T_x\Sigma = T_xS \oplus \mathbb{R}\cdot L_x$. In 2+1-dimensional Minkowski
space-time, sections of forward or backward lightcones are circles and, less easily visualized, in
$d = 3 + 1$ they are two-spheres. Conversely, start from an oriented closed surface $S \subset M$ (i.e. $S$
is 2d, compact, and without boundary; in this context one may think of the two-sphere $S^2$). At
each $x \in S$, the orthogonal complement $(T_xS)^\perp$ has signature $(-+)$ and hence is spanned by two
future-directed lightlike vectors $L_x$ and $\underline{L}_x$. These lightlike vectors may be normalized by

$$g_x(L_x,\underline{L}_x) = -2, \tag{6.58}$$

and together with any basis $(e_1,e_2)$ of $T_xS$ they form a basis of $T_xM$. For example, the pair

$$L := 2\partial_v = \partial_t + \partial_r; \tag{6.59}$$
$$\underline{L} := 2\partial_u = \partial_t - \partial_r, \tag{6.60}$$

does the job in Minkowski space-time $(M,\eta)$. In that case, the family $(L_x)$ is directed outward
and diverges off to infinity, whereas the other one, viz. $(\underline{L}_x)$, is directed inward and converges to
an apex like a Chinese hat. But in general space-times this may not be the case; e.g. inside a
black hole both families bend inwards and one has a trapped surface (cf. Definition 6.13).
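
For the Minkowski example the normalization (6.58) is a one-line check. As a sketch, taking $L = \partial_t + \partial_r$ and $\underline{L} = \partial_t - \partial_r$ and assuming the signature convention $\eta(\partial_t,\partial_t) = -1$, $\eta(\partial_r,\partial_r) = 1$, $\eta(\partial_t,\partial_r) = 0$ (consistent with $g(u,u) = -1$ for timelike unit vectors), one finds

```latex
\begin{align*}
\eta(L,L) &= \eta(\partial_t,\partial_t) + 2\,\eta(\partial_t,\partial_r) + \eta(\partial_r,\partial_r) = -1 + 0 + 1 = 0, \\
\eta(\underline{L},\underline{L}) &= \eta(\partial_t,\partial_t) - 2\,\eta(\partial_t,\partial_r) + \eta(\partial_r,\partial_r) = -1 - 0 + 1 = 0, \\
\eta(L,\underline{L}) &= \eta(\partial_t,\partial_t) - \eta(\partial_r,\partial_r) = -1 - 1 = -2 .
\end{align*}
```

Both vectors are also orthogonal to the spheres of constant $t$ and $r$, whose tangent spaces are spanned by the angular directions.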
At any $x \in S$, consider the fd lightlike geodesic $\gamma^{(x)}$ with $\gamma^{(x)}(0) = x$ and $\dot\gamma^{(x)}(0) = L_x$. These
geodesics collectively form a null congruence emanating from $S$, that is, a hypersurface

$$C := \bigcup_{x \in S}\,\bigcup_{t \geq 0} \gamma^{(x)}(t) = \bigcup_{t \geq 0} S_t, \tag{6.61}$$

where

$$S_t := \bigcup_{x \in S} \gamma^{(x)}(t) \tag{6.62}$$

is the image of $S = S_0$ at time $t$ under the geodesic flow in question, as long as it is defined. Note
that this is not really a hypersurface as we defined it, because it has a boundary

$$\partial C = S. \tag{6.63}$$

Minkowski space-time shows that $C$ may develop conical singularities at finite $t$ and hence may
be a (smooth) surface only up to some $t_0$. In that case, except for the boundary (6.63):


Proposition 6.11 The set $C$ defined by (6.61) is a null hypersurface as long as it is smooth.


First, the lightlike vector field $L$ may be extended from $S$ to $C$ in the obvious way, namely by

$$L_{\gamma^{(x)}(t)} := \dot\gamma^{(x)}(t), \tag{6.64}$$

so that, since each $\gamma^{(x)}$ is a geodesic, $\nabla_L L = 0$. If we push forward any $X_x \in T_xS$ to $T_{\gamma^{(x)}(t)}C$ by
the flow of $L$, then the ensuing vector field $X$ along $\gamma^{(x)}$ satisfies $\mathcal{L}_L X = [L,X] = 0$, so that

$$\frac{d}{dt}\, g(L,X) = L\,g(L,X) = g(\nabla_L L,X) + g(L,\nabla_L X) = g(L,\nabla_X L) = \tfrac12 X g(L,L) = 0. \tag{6.65}$$

Therefore, for fixed $x \in S$, the function

$$t \mapsto g_{\gamma^{(x)}(t)}\bigl(L_{\gamma^{(x)}(t)}, X_{\gamma^{(x)}(t)}\bigr) \tag{6.66}$$

is constant and hence equal to its value at $t = 0$, i.e., at $S$, where it vanishes (since $L$ is normal to
$S$). Since $g(L,L) = 0$, this is true also for $X = L$, so that $L$ is also orthogonal to $C$. This makes $C$
(or rather $C \setminus S$) a null hypersurface by definition.


Thus null hypersurfaces may either be constructed from optical functions or from spacelike
2-surfaces. The former is relevant to the Cauchy problem, whereas the latter applies to Penrose’s
singularity theorem, to which we now slowly turn. Compared to Hawking’s, the
three-dimensional spacelike hypersurface $\Sigma$ is replaced by a 2d closed spacelike surface $S$ (with special
properties to be defined), from which one proceeds as explained above:


• Construct a null hypersurface $C$ with normal vector field $L$, where $S \subset C \subset M$;


• At each $x \in C$, construct a basis $(e_1,e_2,L,\underline{L})$, normalized, also repeating (6.58), by

$$g(e_i,e_j) = \delta_{ij} \quad (i,j = 1,2); \qquad g(e_i,L) = g(e_i,\underline{L}) = 0 \quad (i = 1,2); \tag{6.67}$$
$$g(L,L) = g(\underline{L},\underline{L}) = 0; \qquad g(L,\underline{L}) = -2. \tag{6.68}$$

This can be done by a slight refinement of the construction used for the spacelike case: Starting
at $x \in S$, seen as the initial point of a lightlike geodesic $\gamma^{(x)}(\cdot) \equiv \gamma$ as above, and defining the
basis at $x$, extend $L$ and $\underline{L}$ as explained above, and extend $(e_1,e_2)$ by solving

$$\nabla_L e_i = -g(\nabla_i L,\underline{L})\,L. \tag{6.69}$$

The definition of Jacobi fields along $\gamma$ obtained by varying $\gamma$ within the congruence of all lightlike
geodesics emanating from $S$, as in §6.3, is then entirely similar to the spacelike case. If

$$g(J,L) = g(J,\underline{L}) = 0 \tag{6.70}$$

along $\gamma$, then $J(t) = \sum_{i=1}^{2} J_i(t)e_i(t)$ satisfies Jacobi’s equation (6.41), this time with a matrix

$$\Omega_{ij} = g(e_i,\Omega(L,e_j)L) = R(e_i,L,e_j,L). \tag{6.71}$$

The computations leading to (6.38) and (6.39) may also be redone, mutatis mutandis:²⁹⁶

\begin{align}
(\nabla_L J)_i(t) &= g_{\gamma(t)}(\nabla_L J(t),e_i(t)) = L\bigl(g_{\gamma(t)}(J(t),e_i(t))\bigr) - g_{\gamma(t)}(J(t),\nabla_L e_i(t)) \nonumber\\
&= \dot{J}_i(t) + g_{\gamma(t)}(\nabla_i L(t),\underline{L}(t))\, g_{\gamma(t)}(J(t),L(t)) = \dot{J}_i(t); \nonumber\\
\dot{J}_i(t) &= g_{\gamma(t)}(\nabla_L J(t),e_i(t)) = g_{\gamma(t)}(\nabla_J L(t),e_i(t)) = J_j(t)\, g_{\gamma(t)}(\nabla_j L(t),e_i(t)) \nonumber\\
&= k_{ij}(t)J_j(t), \tag{6.72}
\end{align}

where (6.40) is replaced by

$$k_{ij}(t) = g_{\gamma(t)}(\nabla_j L(t),e_i(t)). \tag{6.73}$$

Since $C$ (or the $S_t$) is given, the distribution spanned by $e_1$ and $e_2$ is integrable and Frobenius’s
theorem again gives $k_{ij} = k_{ji}$, where $i = 1,2$ instead of $i = 1,2,3$ as in §6.1, cf. Proposition 6.3.
For Penrose’s singularity theorem we need the following variation on Definition 6.5, in which
$\mathcal{J}_\gamma^{(S)}$ (replacing $\mathcal{J}_\gamma^{(\Sigma)}$ in Definition 6.5) denotes the space of Jacobi fields along $\gamma$ satisfying (6.70),
or, equivalently, (6.41) with (6.71), on the initial conditions (6.70) and (6.72) at $t = a$.


296 One may introduce the form $k$ on a null hypersurface $\Sigma$ with normal $L$ in a basis-independent way, namely as a
bilinear form on $T_x\Sigma/K_x$, where $K_x$ is the linear span of $L_x$ ($x \in \Sigma$). The Lorentzian metric $g$, which is degenerate on
$T_x\Sigma$, induces a nondegenerate (and Riemannian) metric $h$ on $T_x\Sigma/K_x$, in terms of which $k([X],[Y]) = h([\nabla_X L],[Y])$.

Definition 6.12 A point $y = \gamma(c)$, where $a < c < b$, is focal relative to $x = \gamma(a)$, seen as a
member of the congruence of lightlike geodesics emanating from $S$ and lying in $C$, if there is a
nonzero Jacobi field $J \in \mathcal{J}_\gamma^{(S)}$ satisfying $J(c) = 0$, as well as (6.70) and (6.72) at $t = a$.


The smoothness of $C$ breaks down at focal points. The simplest example is Minkowski space-time
(best visualized in dimension $d = 2 + 1$, where $S$ is a spacelike circle): either follow the
geodesics back in time, or use the ingoing lightlike $\underline{L}$ rather than $L$. All lightlike geodesics then
have the same focal point. If we now define the null expansion $\theta$ on $C$ by

$$\theta := \operatorname{tr}(k), \tag{6.74}$$


then the arguments leading from (6.48) to (6.50) may be repeated verbatim, and so it follows that $\gamma(c)$ is a focal point iff the scalar $\theta$ blows up at c. Before giving conditions for this to happen, we first investigate its geometrical meaning. Take coordinates $(x^1, x^2)$ on S (for example, if $S = S^2$, the usual angles $(x^1 = \varphi, x^2 = \theta)$), so that the area of $S_t$ is given by

$$\mathrm{Area}(S_t) := \int_{S_t} dx^1 dx^2 \sqrt{\det h_{\gamma(t)}(x^1,x^2)} = \int_{S_t} d\mu_t(x^1,x^2), \qquad (6.75)$$

where $h_{\gamma(t)}$ is the metric on $S_t \subset M$ induced by the metric g on M, i.e. $h_{\gamma(t)}(X,Y) = g_{\gamma(t)}(X,Y)$ for $X,Y \in T_{\gamma(t)}S_t$. If d/dt is the directional derivative along L, the key formula, then, is

$$\frac{d\,\mathrm{Area}(S_t)}{dt} = \int_{S_t} d\mu_t\, \theta(t), \qquad (6.76)$$

so that $\theta$ measures the rate of change of the area of $S_t$ when moving along the geodesic $\gamma$. In particular, this area shrinks when $\theta < 0$ and decreases to zero at a focal point, where $\theta = -\infty$.
To prove (6.76), we interpret $k_{ij}(t)$ as defined in (6.73) as a (symmetric) bilinear map
$$k(t): T_{\gamma(t)}S_t \times T_{\gamma(t)}S_t \to \mathbb{R}; \qquad k(X,Y) := g(\nabla_Y L, X), \qquad (6.77)$$
so that we may redefine $\theta$ by rewriting (6.74) with respect to an arbitrary coordinate basis as
$$\theta(t) = \sum_{I,J=1}^{2} h^{IJ}(t)\, k_{IJ}(t). \qquad (6.78)$$


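As usual, in this formula $h^{IJ}$ is the inverse matrix to $h_{IJ} = h(\partial/\partial x^I, \partial/\partial x^J)$ for any (local) coordinates $(x^1,x^2)$ on $S_t$. Using (3.73) with X = L, eq. (6.77), and the symmetry of k, we obtain

$$\mathcal{L}_L h_{IJ}(t) = \dot{h}_{IJ}(t) = 2k_{IJ}(t), \qquad (6.79)$$

upon which the elementary computation (7.12) and text below, applied to h, gives (6.76):

$$\frac{d}{dt}\sqrt{\det h_{\gamma(t)}} = \tfrac12 \sqrt{\det h_{\gamma(t)}}\; h^{IJ}\dot{h}_{IJ}(t) = \sqrt{\det h_{\gamma(t)}}\; h^{IJ}k_{IJ}(t) = \sqrt{\det h_{\gamma(t)}}\; \theta(t). \qquad (6.80)$$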


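Exactly the same constructions apply to the null hypersurface $\underline{C}$ that is obtained from the given surface S by replacing L by $\underline{L}$ (and vice versa) throughout (6.61) - (6.76). For example, if in Minkowski space-time C is erected from the outgoing lightlike directions, then $\underline{C}$ is built from the ingoing ones, so that a focal point arises in the future as a Chinese hat. Thus we define

$$\underline{S}_t := \bigcup_{x \in S} \gamma_x^{(\underline{L})}(t); \qquad \underline{C} := \bigcup_{t \geq 0} \underline{S}_t; \qquad (6.81)$$

$$\underline{k}(X,Y) = g(\nabla_Y \underline{L}, X); \qquad \underline{\theta} = \mathrm{tr}(\underline{k}). \qquad (6.82)$$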
In Minkowski space-time the lightlike vectors (6.59) - (6.60) are easily shown to give
$$\theta(t,r,\theta,\varphi) = 2/r; \qquad \underline{\theta}(t,r,\theta,\varphi) = -2/r. \qquad (6.83)$$
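As a quick numerical sanity check of (6.76) against (6.83), one may compare the rate of change of the area of round spheres in Minkowski space-time with the integral of the expansion. The sketch below is only illustrative: it assumes the normalization $L = \partial_t + \partial_r$ and $\underline{L} = \partial_t - \partial_r$ (so that the affine parameter t shifts r by ±t), which is consistent with (6.83) but is not quoted from (6.59) - (6.60); all function names are hypothetical.

```python
import math

def sphere_area(r):
    """Area of a round 2-sphere of radius r in flat space."""
    return 4.0 * math.pi * r * r

def area_rate(r0, t, sign=+1, h=1e-6):
    """Central-difference d/dt of Area(S_t), assuming r(t) = r0 + sign * t.

    sign=+1 follows the outgoing direction L, sign=-1 the ingoing one.
    """
    r_plus = r0 + sign * (t + h)
    r_minus = r0 + sign * (t - h)
    return (sphere_area(r_plus) - sphere_area(r_minus)) / (2.0 * h)

def expansion(r, sign=+1):
    """Null expansion as in (6.83): 2/r for sign=+1, -2/r for sign=-1."""
    return sign * 2.0 / r

r0, t = 3.0, 0.5
for sign in (+1, -1):
    r = r0 + sign * t
    lhs = area_rate(r0, t, sign)
    # The expansion is constant on S_t, so the integral in (6.76)
    # reduces to expansion times area.
    rhs = expansion(r, sign) * sphere_area(r)
    assert math.isclose(lhs, rhs, rel_tol=1e-6)
```

Along the ingoing direction the area shrinks to zero at r = 0, which is the common focal point mentioned above.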
Assumption 3 on $\Sigma$ in Hawking’s Theorem 6.4 is now replaced by Penrose’s assumption on S:


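Definition 6.13 A future trapped surface is a closed spacelike surface $S \subset M$ with
$$\theta(x) < 0; \qquad \underline{\theta}(x) < 0 \qquad (6.84)$$
for all $x \in S$, where $\theta$ and $\underline{\theta}$ are defined by (6.74) and (6.82), with L and $\underline{L}$ future directed.$^{297}$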


This “infinitesimally” states that all future-directed lightlike geodesics emanating orthogonally from S bend inwards, not just (as in Minkowski space-time) those along $\underline{L}$.$^{298}$ The picture on the next page illustrates this. The signs of $\theta$ and $\underline{\theta}$ depend on the time-orientation of the lightlike vectors L and $\underline{L}$; there is an analogous concept of past trapped surfaces. An appealing interpretation of Definition 6.13 follows from a slight generalization of (6.76). Let X be some convex combination of L and $\underline{L}$ and let $S_t^{(X)}$ be the image of S under the flow $\varphi_t$ of X, $t \geq 0$, so that $S_t = S_t^{(L)}$. Then

$$\frac{d\,\mathrm{Area}(S_t^{(X)})}{dt}\Big|_{t=0} = -\tfrac12 \int_S d\mu_S \left( g(X,\underline{L})\,\theta + g(X,L)\,\underline{\theta} \right), \qquad (6.85)$$

so that the area of a trapped surface S decreases in any future orthogonal direction.
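Further changes from Hawking’s setting to Penrose’s are:

• The spacelike (3d) hypersurface $\Sigma$ is replaced by a closed spacelike surface S;
• For the normal u one may take either L or $\underline{L}$ (as both vectors are orthogonal to S);
• The expressions (6.10) - (6.16) now become

$$A^\mu := L^\nu \nabla_\nu L^\mu; \qquad (6.86)$$
$$h^\mu_\nu := \delta^\mu_\nu + \tfrac12 (L^\mu \underline{L}_\nu + \underline{L}^\mu L_\nu); \qquad (6.87)$$
$$k_{\mu\nu} := h^\rho_\mu h^\sigma_\nu \nabla_\rho L_\sigma; \qquad (6.88)$$
$$\omega_{\mu\nu} := k_{[\mu\nu]}; \qquad (6.89)$$
$$\sigma_{\mu\nu} := k_{(\mu\nu)} - \tfrac12 \theta h_{\mu\nu}; \qquad (6.90)$$
$$\theta := g^{\mu\nu} k_{\mu\nu} = h^{\mu\nu} k_{\mu\nu} = \mathrm{tr}(k); \qquad (6.91)$$
$$\Rightarrow \quad k_{\mu\nu} = \tfrac12 \theta h_{\mu\nu} + \sigma_{\mu\nu} + \omega_{\mu\nu}. \qquad (6.92)$$

We have 1/2 in (6.90) and (6.92) as opposed to the 1/3 in (6.14) and (6.16) because $\sigma$ is the traceless part of k (its trace already being taken care of by $\theta$), and this time,

$$h^\mu_\mu = \delta^\mu_\mu + \tfrac12 \left( g(L,\underline{L}) + g(\underline{L},L) \right) = 4 - 1 - 1 = 2 = \dim(S). \qquad (6.93)$$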


297 See Hawking & Ellis (1973), chapter 9, for an early analysis of trapped surfaces in causal theory per se. The study of trapped surface formation from the PDE point of view began with Schoen & Yau (1983), who gave initial values that already contain trapped surfaces; see also Alaee, Lesourd, & Yau (2019). Christodoulou (1991, 1999a, 2009) first proved the evolution of asymptotically flat initial data into trapped surfaces. See also follow-ups by Klainerman & Rodnianski (2012) and Klainerman, Luk, & Rodnianski (2014), and reviews by Dafermos (2012) and Bieri (2018). For the incorporation of more realistic matter models see e.g. Burtscher & LeFloch (2014) and Burtscher (2020). Other literature may be traced back from Li & Yu (2015) and Athanasiou & Lesourd (2020).

298 Senovilla (1998), §4, gives many examples. In the presence of a radial coordinate r as in the Schwarzschild solution, this condition is equivalent to the gradient $\nabla r$ being timelike, which happens for r < 2m.
Picture of a collapsing star, here shown to illustrate Penrose’s concept of a trapped surface. Far from the collapsing matter, the local lightcones are as in Minkowski space-time, with $\theta > 0$ and $\underline{\theta} < 0$, so that the area of a surface (represented by a circle in this picture) grows along L and shrinks along $\underline{L}$. While $\underline{\theta} < 0$ everywhere, $\theta$ vanishes at r = 2m and then changes sign to $\theta < 0$ for 0 < r < 2m, so that the circles shrink in both directions and the lightcones become thinner. The key point is that $\theta < 0$ indicates that the originally outgoing light-rays along L now bend inwards and cannot escape, justifying the relevance to black holes. Taken from Penrose (1969), Figure 2.
• We now assume that the lightlike vector fields L and $\underline{L}$ are normalized such that

$$\nabla_L L = 0; \qquad \nabla_{\underline{L}}\underline{L} = 0. \qquad (6.94)$$


The lightlike version of the expression (6.17) is then given by 


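which is easily verified by computing all contractions with L, $\underline{L}$, and $e_i$, using (6.88), (6.87), and (6.68). Another useful result,$^{299}$ still assuming (6.94), is

$$\theta = \nabla_\mu L^\mu; \qquad \underline{\theta} = \nabla_\mu \underline{L}^\mu. \qquad (6.96)$$
To see this, one again uses (6.88), (6.87), and (6.68). For example, we compute
$$\theta = h^{\mu\nu}h^\rho_\mu h^\sigma_\nu \nabla_\rho L_\sigma = h^{\rho\sigma}\nabla_\rho L_\sigma = \left(g^{\rho\sigma} + \tfrac12 (L^\rho \underline{L}^\sigma + \underline{L}^\rho L^\sigma)\right)\nabla_\rho L_\sigma$$
$$= \nabla_\mu L^\mu + \tfrac12\left(g(L, \nabla_{\underline{L}}L) + g(\underline{L}, \nabla_L L)\right) = \nabla_\mu L^\mu, \qquad (6.97)$$

since g(L,L) = 0 implies $g(L, \nabla_{\underline{L}}L) = 0$ by a calculation like (6.52), and we had (6.94).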
• If $\omega_{\mu\nu} = 0$, see the comment after (6.73), as well as (6.94), the same argument as in the timelike case then implies Raychaudhuri’s equation along the outward directions L, viz.$^{300}$
$$\nabla_L \theta = \dot\theta = -\tfrac12\theta^2 - \sigma_{\mu\nu}\sigma^{\mu\nu} - R_{\mu\nu}L^\mu L^\nu. \qquad (6.98)$$

This replaces (6.27), with a similar derivation. First, as in (6.26) we obtain
$$\nabla_L(\nabla_\mu L_\nu) = -(\nabla_\mu L^\rho)\nabla_\rho L_\nu + R_{\mu\rho\sigma\nu}L^\rho L^\sigma \qquad (6.99)$$

straight from the definition of the Riemann tensor and the property A = 0. We then substitute (6.95), contract with $g^{\mu\nu}$, and substitute (6.92), with $\omega = 0$. Analogous reasoning then leads to the following null counterpart of Proposition 6.7:$^{301}$


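Proposition 6.14 Let $S \subset M$ be a closed spacelike surface, let $x \in S$, and let $\gamma = \gamma_x^{(L)}$ be a lightlike geodesic as above. If
$$\theta(x) < 0; \qquad R_{\mu\nu}L^\mu L^\nu \geq 0 \qquad (6.100)$$

along $\gamma$, then $\gamma$ has a (later) focal point $\gamma(c)$ relative to $x = \gamma(a)$, provided that $\gamma$ can indeed be extended from a to c. The same statement holds for $\gamma = \gamma_x^{(\underline{L})}$, assuming that along $\gamma$ we have

$$\underline{\theta}(x) < 0; \qquad R_{\mu\nu}\underline{L}^\mu \underline{L}^\nu \geq 0. \qquad (6.101)$$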


299 Eq. (6.96) is a quick way to verify (6.83), e.g. $\theta = \Gamma^\mu_{\mu 0} + \Gamma^\mu_{\mu r} = 0 + \Gamma^\theta_{\theta r} + \Gamma^\varphi_{\varphi r} = 1/r + 1/r = 2/r$.

300 One also has a similar equation for the Weingarten map W associated to k, but in the absence of an orthogonal projection $T_xC \to T_xS$, where $x \in C$, one has to replace $T_xS$ by $\tilde{T}_xS := T_xC/\mathbb{R}\cdot L_x$, with associated projection $T_xC \to \tilde{T}_xS$, $X \mapsto \tilde{X}$. Defining $W_x : \tilde{T}_xS \to \tilde{T}_xS$ by $W_x(\tilde{X}) := -\widetilde{\nabla_X L}$, one can show that along $\gamma$ one has $\dot{W} = W^2 + R$, where $R_x : \tilde{T}_xS \to \tilde{T}_xS$ is defined by $R_x(\tilde{X}) := \widetilde{\Omega_x(X,L)L}$. This also yields (6.98). See e.g. Galloway (2017).

301 See Hawking & Ellis (1973), Propositions 4.4.4 to 4.4.6, for details.
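The algebra behind (6.87) and (6.93) can be checked in a few lines. The following sketch is an illustration only: it works at a single point of Minkowski space-time in Cartesian coordinates and assumes the null pair L = (1,1,0,0), $\underline{L}$ = (1,−1,0,0), chosen so that $g(L,\underline{L}) = -2$ as (6.93) requires; the helper names are hypothetical.

```python
# Minkowski metric eta = diag(-1, 1, 1, 1) in Cartesian coordinates (t, x, y, z).
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def lower(v):
    """Lower the index of a vector: v_mu = eta_{mu nu} v^nu."""
    return [sum(ETA[m][n] * v[n] for n in range(4)) for m in range(4)]

def inner(u, v):
    """Lorentzian inner product g(u, v)."""
    return sum(u[m] * w for m, w in enumerate(lower(v)))

def projector(L, Lbar):
    """h^mu_nu = delta^mu_nu + (L^mu Lbar_nu + Lbar^mu L_nu)/2, cf. (6.87)."""
    L_low, Lbar_low = lower(L), lower(Lbar)
    return [[(1.0 if m == n else 0.0)
             + 0.5 * (L[m] * Lbar_low[n] + Lbar[m] * L_low[n])
             for n in range(4)] for m in range(4)]

def apply(h, v):
    """Apply the mixed tensor h^mu_nu to a vector v^nu."""
    return [sum(h[m][n] * v[n] for n in range(4)) for m in range(4)]

L = [1.0, 1.0, 0.0, 0.0]       # assumed outgoing null vector
LBAR = [1.0, -1.0, 0.0, 0.0]   # assumed ingoing null vector
H = projector(L, LBAR)
TRACE = sum(H[m][m] for m in range(4))

assert inner(L, L) == 0.0 and inner(LBAR, LBAR) == 0.0
assert inner(L, LBAR) == -2.0
assert TRACE == 2.0                      # dim(S), as in (6.93)
assert apply(H, L) == [0.0] * 4          # h annihilates both null normals
assert apply(H, LBAR) == [0.0] * 4
```

In this example h is simply the projection onto the (y,z)-plane, i.e. onto the tangent space of the surface orthogonal to both null directions.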
6.4 Penrose’s singularity theorem


We now come to one of the highlights of (mathematical) GR. Proposition 6.14 will lead to a contradiction with global hyperbolicity, as in Hawking’s theorem, provided some and hence all Cauchy hypersurfaces $\Sigma$ are non-compact. If one envisages applications to black holes in asymptotically flat space-times (to be defined), then this seems a reasonable assumption.

Theorem 6.15 Let (M,g) be globally hyperbolic with non-compact Cauchy surface $\Sigma$. Assume:

1. One has $R_{\mu\nu}\dot\gamma^\mu \dot\gamma^\nu \geq 0$ along all lightlike geodesics $\gamma$ (null curvature condition);
2. The space-time M contains a future trapped surface.

Then (M,g) has incomplete future-directed lightlike geodesics.


Proof. The proof is based on properties of the future horismos $E^+(S) = J^+(S) \setminus I^+(S)$.
Lemma 6.16 Let (M,g) be a globally hyperbolic space-time and let $S \subset M$ be compact.
1. If M has a non-compact Cauchy surface, then $E^+(S)$ is non-compact.
2. If: i) assumptions 1 and 2 in the theorem hold; ii) S is a trapped surface (which is compact by definition); iii) all lightlike geodesics in M are complete, then $E^+(S)$ is compact.

The proofs of both claims rely on the following consequence of global hyperbolicity.
Lemma 6.17 If (M,g) is globally hyperbolic and $S \subset M$ is compact, then

$$E^+(S) = \partial I^+(S). \qquad (6.102)$$

Proof. This follows from Lemma 5.29, which makes $J^+(S)$ closed, and then (5.148).


Recall that a hypersurface $A \subset M$ is achronal iff each timelike curve intersects it at most once, cf. (5.145). Furthermore,$^{302}$ $F \subset M$ is a future set if $I^+(F) \subset F$ (in other words, $I^+(x) \subset F$ for all $x \in F$), in which case $\partial F$ is called an achronal boundary. Clearly, $F = I^+(S)$ is a future set, and hence $\partial I^+(S)$ is an achronal boundary; see (5.146) etc. for the proof that it is indeed achronal (the proof for general F, which we do not need, is similar). Since $I^+(S)$ is open, its boundary has codimension one (i.e. is 3d in 4d space-time) where it is smooth. The following more specific result will also be important for the analysis of event horizons (cf. §10.7):


Proposition 6.18 Achronal boundaries are locally Lipschitz topological hypersurfaces in M. 


Knowing this,$^{303}$ the most rigorous way to prove part 1 of Lemma 6.16 is to use the following:

Proposition 6.19 Let (M, g) be a globally hyperbolic space-time. Then any compact achronal topological hypersurface $\Sigma$ in M is a Cauchy surface.


302 Penrose (1972) defines future sets through $I^+(F) = F$. We have equality in $I^+(F) \subset F$ iff F is open.

303 See Minguzzi (2019), Theorem 2.87 (iii). To define the Lipschitz condition we equip M with a complete Riemannian metric, so that it also becomes a metric space, and ask the map $\varphi : U \to V$ in (4.123) to be bi-Lipschitz (i.e. Lipschitz with Lipschitz inverse). See e.g. Naumann & Simader (2011), §2.1. The simplest examples, such as $S = \partial I^+(0)$ in M, whose boundary is not smooth at the apex, show that achronal boundaries need not be smooth.
Proof (sketch).$^{304}$ First, the definition of a hypersurface implies that it has no boundary, which implies that $J^+(\Sigma) \cup J^-(\Sigma)$ is open. If $\Sigma$ is compact, then $J^\pm(\Sigma)$ is closed by Lemma 5.29 and the assumption of global hyperbolicity. Hence $J^+(\Sigma) \cup J^-(\Sigma)$ is both open and closed, and since our space-times M are connected by definition, we must have

$$J^+(\Sigma) \cup J^-(\Sigma) = M. \qquad (6.103)$$

So any $x \in M$ must lie either in $J^+(\Sigma)$ or in $J^-(\Sigma)$, or in both, i.e. in $\Sigma$, which trivial case we exclude. Suppose $x \in J^+(\Sigma)$ and consider a past inextendible timelike curve c ending at x. If c does not intersect $\Sigma$, then it must stay in $J^+(\Sigma) \cap J^-(x)$. This set is compact by Lemma 5.29, so the curve is imprisoned, which is impossible by Definition 5.27 of global hyperbolicity.


In view of Theorem 5.33.2, which excludes the possibility of M having one Cauchy surface that is compact and another that is not, Propositions 6.18 and 6.19 clearly imply Lemma 6.16.1.$^{305}$
We now turn to the second part of Lemma 6.16. Given (6.102), the idea is to use Corollary 5.16. The lightlike geodesics ruling $\partial J^+(S)$ come from both L and $\underline{L}$; in d = 2 + 1 a picture where the trapped “surface” is just a circle shows this very clearly. This obviously gives the inclusion $\partial I^+(S) \subset C \cup \underline{C}$, see (6.61) and (6.81). The key is a refinement of this inclusion:

Lemma 6.20 For any closed spacelike surface S in a globally hyperbolic space-time (M,g), with associated null surfaces C and $\underline{C}$ defined by (6.61) and (6.81), one has
$$E^+(S) \subseteq C_{\mathrm{reg}} \cup \underline{C}_{\mathrm{reg}} \subseteq C \cup \underline{C} \subseteq J^+(S), \qquad (6.104)$$

where $C_{\mathrm{reg}} \subset C$ is the regular (and hence smooth) part of C, and $\underline{C}_{\mathrm{reg}}$ is defined likewise. More precisely, $C_{\mathrm{reg}}$ is defined as the subset consisting of the parts of all (fd) lightlike geodesics in C starting in S before their first focal points (if any) are encountered (and similarly for $\underline{C}_{\mathrm{reg}}$).


The first inclusion in (6.104) follows from the counterpart of Theorem 6.6 for lightlike geodesics:

Theorem 6.21 A lightlike geodesic $\gamma: [a,b] \to M$ in C or $\underline{C}$ (with $\gamma(a) \in S$) locally maximizes the length of causal curves from $\gamma(a)$ to $\gamma(b)$ (not necessarily in C or $\underline{C}$) iff there is no focal point $\gamma(c)$ on $\gamma$, a < c < b. Therefore, if there is an intermediate focal point, then $\gamma(b) \in I^+(\gamma(a))$.

This is a “lightlike” adaptation of Theorem 5.12, whose long proof we omit.$^{306}$ We do note that a lightlike geodesic can only maximize length if all other comparable causal curves (between the given endpoints) have zero length, too. This is why focal points and the existence of (longer) timelike curves go hand in hand. See also the comment after the proof of Proposition 5.9.


304 For complete proofs see Budic et al. (1978), Theorem 1, and Galloway (1985), Theorem 1 and Corollary 1.

305 Penrose (1965) himself, followed by Hawking & Ellis (1973), §8.2, Theorem 1, used a different argument: Since M has Cauchy surface $\Sigma$ and $E^+(S)$ is achronal, via the flow of a complete timelike vector field (which exists because space-times are time orientable), any $x \in E^+(S)$ projects onto a unique point of $\Sigma$. This gives a continuous injective map $\pi: E^+(S) \to \Sigma$, which is a homeomorphism onto its image $\pi(E^+(S))$ in $\Sigma$ (recall that any continuous bijection from a compact space to a Hausdorff space has a continuous inverse). Since $E^+(S)$, being a boundary itself by (6.102), has no boundary, its homeomorphic image $\pi(E^+(S))$ has no boundary either. Similarly, if $E^+(S)$ is compact, then $\pi(E^+(S))$ is compact, too. In that case the non-compact (sub)manifold $\Sigma$ would have a compact submanifold of the same dimension without boundary, which is impossible (Aretakis, 2013, §5.5): any $x \in \pi(E^+(S))$ would then have an open nbhd U which is also open in $\Sigma$, since $\dim(E^+(S)) = \dim \pi(E^+(S)) = \dim(\Sigma) = 3$. Hence $\pi(E^+(S))$ would be open in $\Sigma$, but it is also closed since it is compact. Since $\Sigma$ is connected (by Proposition 5.33.1) this implies $\pi(E^+(S)) = \Sigma$, which is impossible because $\Sigma$ is not compact whilst $\pi(E^+(S))$ is.

306 See Hawking & Ellis (1973), Propositions 4.5.10 to 4.5.14, O’Neill (1983), Propositions 10.46 to 10.48, or Kriele (1999), Lemma 4.6.15, Theorem 4.6.2(iii) and Corollary 8.3.1, and Minguzzi (2019), Theorem 6.16.
If now $y \in C$ lies beyond a focal point on some lightlike geodesic $\gamma_x^{(L)}$ in C, then by Corollary 5.16 it cannot lie in $\partial I^+(S)$, since now there is a timelike curve from x to y, and likewise for $\underline{L}$ and $\underline{C}$. This gives the inclusion $E^+(S) \subset C_{\mathrm{reg}} \cup \underline{C}_{\mathrm{reg}}$. Now suppose that all assumptions in Lemma 6.16.2 hold. Then Proposition 6.14 applies. Each (lightlike) geodesic $\gamma$ in C reaches its first focal point in finite time $t_\gamma$; if $\gamma$ starts at $x \in S$, then $t_\gamma \leq 2/|\theta(x)|$, for which the argument is the same as after (6.34), except that in the null Raychaudhuri equation (6.98) one has $-\tfrac12\theta^2$ instead of $-\tfrac13\theta^2$ as in the timelike case (6.27). Since S is compact and $\theta(x) < 0$ for all $x \in S$ by definition of a trapped surface, one has $\Theta := \inf_{x \in S}\{|\theta(x)|\} > 0$, so that by the time $t_L = 2/\Theta < \infty$ each geodesic in C has passed its first focal point. Likewise for $\underline{L}$ and $\underline{C}$, giving $\underline{\Theta} = \inf_{x \in S}\{|\underline{\theta}(x)|\} > 0$ and a time $t_{\underline{L}} = 2/\underline{\Theta} < \infty$ playing the same role for $\underline{C}$. It follows that

$$E^+(S) \subset C_{\mathrm{reg}} \cup \underline{C}_{\mathrm{reg}} \subset \Big(\bigcup_{t \in [0,t_L]} S_t\Big) \cup \Big(\bigcup_{t \in [0,t_{\underline{L}}]} \underline{S}_t\Big). \qquad (6.105)$$

By (6.102), this makes $E^+(S)$ a closed subset of a compact set, so that it is compact, which proves Lemma 6.16.2. Given the assumptions of Theorem 6.15, the only way out of the contradiction between compactness and non-compactness of $E^+(S)$ is to invalidate the invocation of Proposition 6.14 by using its proviso ‘provided that $\gamma$ can indeed be extended from a to c’, which would be guaranteed by lightlike geodesic completeness. So this cannot be the case; the proof shows that at least one incomplete lightlike geodesic emanates from the trapped surface S.
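The focusing bound $2/|\theta(x)|$ used in the proof can be illustrated numerically. The sketch below integrates the null Raychaudhuri equation (6.98) with $\omega = 0$, lumping the non-negative terms $\sigma_{\mu\nu}\sigma^{\mu\nu} + R_{\mu\nu}L^\mu L^\nu$ into a single constant `source`; that constancy, the step size, and the cutoff are assumptions made purely for illustration, and all names are hypothetical.

```python
def focal_time_bound(theta0):
    """Upper bound 2/|theta0| on the first focal time, valid for theta0 < 0."""
    if theta0 >= 0:
        raise ValueError("the bound requires theta0 < 0")
    return 2.0 / abs(theta0)

def uniform_focal_time(thetas):
    """Uniform bound 2/inf|theta| over a (sampled) compact trapped surface."""
    return 2.0 / min(abs(th) for th in thetas)

def blow_up_time(theta0, source=0.0, dt=1e-4, cutoff=-1e6):
    """Integrate d(theta)/dt = -theta**2/2 - source by classical RK4
    and return the first time at which theta drops below `cutoff`.

    `source` >= 0 is a constant stand-in for sigma^2 + Ric(L, L).
    """
    if theta0 >= 0:
        raise ValueError("this sketch assumes theta0 < 0")

    def f(th):
        return -0.5 * th * th - source

    t, th = 0.0, theta0
    while th > cutoff:
        k1 = f(th)
        k2 = f(th + 0.5 * dt * k1)
        k3 = f(th + 0.5 * dt * k2)
        k4 = f(th + dt * k3)
        th += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return t

# Borderline case (no shear, no curvature): blow-up essentially at 2/|theta0|.
assert abs(blow_up_time(-2.0) - focal_time_bound(-2.0)) < 1e-2
# A positive source term only makes the focal point arrive earlier.
assert blow_up_time(-2.0, source=1.0) < focal_time_bound(-2.0)
```

For `source = 0` the equation has the closed-form solution $\theta(t) = \theta_0/(1 + \theta_0 t/2)$, which diverges exactly at $t = 2/|\theta_0|$; any admissible extra term is negative and can only bring the divergence forward.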


Neither Hawking’s nor Penrose’s singularity theorem proves the existence of a singularity in the sense of Definition 6.1.3. These theorems only show causal geodesic incompleteness and as such they are better called incompleteness theorems. The “singularity” theorems were inspired by intuition from the big bang and from black holes, as described by the time-honoured Schwarzschild and FLRW solutions (6.1) - (6.3), where some quantity defined via the metric becomes singular (that is, infinite or zero). However, running ahead of chapters 9 and 10, consider the Kerr solution (9.110) for 0 < |a| < m or even more simply the Reissner-Nordström solution (9.88) - (9.89) for 0 < |e| < m, and suppose we do not look at the maximally extended solutions but rather at the maximal globally hyperbolic development (MGHD) of a typical maximal spacelike (achronal) hypersurface on which the initial data induced by the global solutions are given (see §7.6). The picture then changes completely: the ensuing space-times still satisfy the assumptions of Penrose’s theorem, but they are not singular in any metric sense, because the singularities lie behind the Cauchy horizon of the initial data hypersurface. In this (PDE) picture, causal geodesic incompleteness rather means that the space-times in question are extendible. The maximal (analytic) extensions are singular in a metric sense, and in addition they fail to be globally hyperbolic (whereas any MGHD is globally hyperbolic by construction). These properties are related to each other; see §10.4 in connection with (strong) cosmic censorship.

Penrose’s theorem is the mother of all singularity theorems in GR and, with Hawking’s 
(which is perhaps the father), also the cleanest one. Apart from the introduction of topological 
methods in GR, which was new at the time, among its main achievements one should already 
count a key definition Penrose introduced in GR, namely that of a trapped surface. 

Nonetheless, there is room for weakening the assumptions in Penrose’s theorem.$^{307}$ The most cited way of doing this is the combined Hawking-Penrose singularity theorem:$^{308}$


307 As well as in Hawking’s, where, as already noticed by Hawking (1967) himself, global hyperbolicity may be 
replaced by strong causality, in which case the Cauchy surface in its proof may be replaced by a partial one. 
308 See Hawking & Penrose (1970), Hawking & Ellis (1973), §8.2, Theorem 2, or Senovilla (1998), Theorem 5.6.
Theorem 6.22 Let (M,g) be a chronological space-time (i.e. M contains no closed timelike curves). If $R_{\mu\nu}u^\mu u^\nu \geq 0$ for every causal vector u, and on top of that the genericity condition

$$\dot\gamma^\rho \dot\gamma^\sigma\, \dot\gamma_{[\alpha}R_{\beta]\rho\sigma[\mu}\dot\gamma_{\nu]} \neq 0 \qquad (6.106)$$
holds in at least one point of every causal geodesic $\gamma$, and at least one of the following is present:
1. A compact edgeless achronal set;
2. A closed trapped surface;
3. A point $x \in M$ such that the lightlike geodesics from x are focused and reconverge,
then the space-time in question is causally geodesically incomplete.


The main achievement of this theorem is that global hyperbolicity has been weakened, and that the assumptions in Hawking’s and Penrose’s theorems are somehow combined. But the price is high: the assumption (6.106) is contrived and purpose-driven, and in addition the theorem does not so much strengthen as weaken Penrose’s theorem:$^{309}$ because of the choice menu in its assumptions, in the case of say a black hole in an expanding universe the theorem may point towards the big bang singularity whilst saying nothing about the black hole singularity, or vice versa; whereas the separate theorems of Hawking and Penrose would identify both. Instead, a real and useful strengthening of Penrose’s theorem is given by Minguzzi’s singularity theorem:

Theorem 6.23 Let a space-time (M,g) satisfy assumptions 1 and 2 in Theorem 6.15, and also:
1. $I^+(x) \subset I^+(y)$ implies $I^-(y) \subset I^-(x)$ (i.e., (M,g) is past reflecting);
2. M does not contain a compact spacelike hypersurface.

Then (M,g) has incomplete future-directed lightlike geodesics.


Condition 1 weakens Penrose’s assumption of global hyperbolicity. In view of Proposition 6.19, condition 2 weakens his assumption that M contains a non-compact Cauchy surface. These conditions were inspired by black hole evaporation. Other assumptions one may weaken include:

• the pointwise energy/curvature conditions, which can be replaced by averages;$^{310}$
• the presence of a trapped surface, which can be replaced by an outer trapped surface;$^{311}$
• the assumption that the space-time is chronological (which e.g. a Kerr black hole is not);$^{312}$
• the regularity of the metric, so far tacitly assumed smooth.$^{313}$


309 This point was made by Minguzzi (2020), which is also the source of Theorem 6.23 below.

310 See Fewster & Galloway (2011), Fewster & Kontou (2020), and Freivogel, Kontou, & Krommydas (2020).

311 These things are defined in §10.11. Briefly, a (marginally) outer trapped surface has $\theta < 0$ ($\theta = 0$), irrespective of the sign of $\underline{\theta}$, where L is outward pointing (provided this can be defined). Outer trapped surfaces appear in the topological singularity theorem of Gannon (1975) and Lee (1976), in which assumption 2 in Theorem 6.15 is replaced by the mere assumption that $\Sigma$ is non-simply connected. See also Galloway (2017), Theorem 3.3. The proof constructs an outer trapped surface in the universal cover $\tilde\Sigma$. Eichmair, Galloway, & Pollack (2013) showed that, assuming the genericity condition (6.106), condition 2 in Theorem 6.15 may be replaced by the presence of a marginally outer trapped surface in the Cauchy surface $\Sigma$. In a variation on this result, Chruściel & Galloway (2014) replaced (6.106) by assuming that the second fundamental form k defined in (6.77) is not identically zero.

312 See Lesourd (2018). Another interesting (cosmological) singularity theorem of his is in Lesourd (2019).

313 See Graf et al. (2018) and references therein. One can go down to $C^{1,1}$ (i.e. derivatives are locally Lipschitz).
7 The Einstein equations
As noticed independently by Einstein and Hilbert in 1916, the Einstein equations 


whose left-hand side we now understand, and whose right-hand side will be explained in §7.3, can be derived from a variational principle. The geometrical quantity to be extremized in order to obtain the left-hand side is the Einstein-Hilbert action for the gravitational field, given by

$$S_G(g) = \int_V d^4x \sqrt{-g(x)}\, R(x), \qquad (7.2)$$

where $R = g^{\mu\nu}R_{\mu\nu}$ is the Ricci scalar. To be on the safe side in so far as convergence of integrals is concerned, $V \subset M$ is a compact region in space-time M with open interior (or, equivalently, an open region with compact closure; this does not matter with respect to a measure like $d^4x$), and g = det(g) is the determinant of the matrix $g_{\mu\nu}$ (in any basis), cf. (3.14) - (3.15).

Eq. (7.2) will later be supplemented by boundary terms, which are needed in case things on $\partial V$ are not under control; for the moment we omit these (life without them is difficult enough!).


7.1 Integration on manifolds 


To make sense of (7.2) and its variation we need some integration theory, for which we assume some familiarity with the calculus of differential forms.$^{314}$ For simplicity we assume that M is orientable, which means that there is an atlas (within the equivalence class of atlases defining the manifold, cf. §2.1) for which all transition functions $\varphi_\beta \circ \varphi_\alpha^{-1}$ have positive Jacobian. An orientation of an orientable manifold is the equivalence class of an atlas satisfying this condition. It can be shown that M is orientable iff it admits a nowhere vanishing n-form $\omega \in \Omega^n(M)$. Such an $\omega$ defines an orientation: one only accepts charts $\varphi$ whose coordinates $(x^1,\ldots,x^n)$ satisfy

$$\omega(\partial_1,\ldots,\partial_n) > 0. \qquad (7.3)$$

In the presence of a metric we normalize $\omega$ (which can be multiplied by an arbitrary smooth strictly positive function) such that in some, and hence in all coordinates one has, equivalently,

$$\omega(\partial_1,\ldots,\partial_n) = \sqrt{|g|}; \qquad \omega_x = \sqrt{|g(x)|}\, dx^1 \wedge \cdots \wedge dx^n. \qquad (7.4)$$


314 See e.g. Choquet-Bruhat & DeWitt-Morette (1982), chapter IV (a book I devoured as a student). Briefly, if dim(M) = n and $0 \leq p \leq n$, a p-form on M is a totally antisymmetric element of $\mathfrak{X}^{(p,0)}(M)$, cf. §2.5. These form a $C^\infty(M)$-submodule of $\mathfrak{X}^{(p,0)}(M)$, called $\Omega^p(M)$ or $\Lambda^p(M)$. One has multilinear maps $\wedge : \Omega^p(M) \times \Omega^q(M) \to \Omega^{p+q}(M)$ defined by concatenation followed by total antisymmetrization, called exterior multiplication, as well as linear maps $d : \Omega^p(M) \to \Omega^{p+1}(M)$, called the exterior derivative, which are uniquely characterized by the properties: i) eq. (2.56) at p = 0; ii) $d(\alpha \wedge \beta) = (d\alpha) \wedge \beta + (-1)^p \alpha \wedge d\beta$, where $\alpha \in \Omega^p(M)$; iii) $d^2 = 0$; iv) locality, in the sense that if $\alpha = \beta$ on some $U \in \mathcal{O}(M)$, then $d\alpha = d\beta$ on U. In coordinates one then has $(d\alpha)_{\mu_1 \cdots \mu_{p+1}} = \partial_{[\mu_1}\alpha_{\mu_2 \cdots \mu_{p+1}]}$. It follows that $\dim(\Omega^n_x(M)) = 1$ for all $x \in M$, with basis $dx^1 \wedge \cdots \wedge dx^n$. Finally, each vector field $X \in \mathfrak{X}(M)$ defines insertion maps $i_X : \Omega^p(M) \to \Omega^{p-1}(M)$ that are uniquely characterized by the following properties: i) $i_X = 0$ on $\Omega^0(M) = C^\infty(M)$; ii) $i_X\theta = \theta(X)$ for $\theta \in \Omega^1(M) = \Omega(M)$; iii) $i_X(\alpha \wedge \beta) = (i_X\alpha) \wedge \beta + (-1)^p \alpha \wedge i_X\beta$, where again $\alpha \in \Omega^p(M)$. In coordinates, one has $(i_X\alpha)_{\mu_2 \cdots \mu_p} = X^{\mu_1}\alpha_{\mu_1\mu_2 \cdots \mu_p}$.
This expression is indeed well defined in that $\omega$ keeps this form under coordinate transformations. To see this, we use the change of coordinates formula (2.77) for the metric, i.e.,

$$g'_{\mu'\nu'}(y) = \frac{\partial x^\mu}{\partial y^{\mu'}}\frac{\partial x^\nu}{\partial y^{\nu'}}\, g_{\mu\nu}(x), \qquad (7.5)$$

in which we write y for $x_\beta$ and x for $x_\alpha$, so that y = y(x), and as a matrix we have

$$\left(\frac{\partial x^i}{\partial y^j}\right) = \left(\frac{\partial y^k}{\partial x^l}\right)^{-1}. \qquad (7.6)$$

Then

$$g(y) = g(x)\left(\det\left(\frac{\partial x}{\partial y}\right)\right)^{2} = g(x)\left(\det\left(\frac{\partial y}{\partial x}\right)\right)^{-2}. \qquad (7.7)$$

This gives coordinate-independence of (7.4), as well as the (equivalent) property

$$d^4y\sqrt{|g(y)|} = d^4x \left|\det\left(\frac{\partial y}{\partial x}\right)\right| \left|\det\left(\frac{\partial y}{\partial x}\right)\right|^{-1}\sqrt{|g(x)|} = d^4x\sqrt{|g(x)|}. \qquad (7.8)$$

The following formula then either defines the left-hand side or gives a formula for it:

$$\int_M f\omega = \int_M d^4x \sqrt{|g(x)|}\, f(x), \qquad (7.9)$$

for any $f \in C_c^\infty(M)$, where $\omega$ is defined by (7.4), and the right-hand side should be written as a sum over various coordinate patches using a partition of unity. As we saw, the expression

$$d^4x\sqrt{|g(x)|} \qquad (7.10)$$

is invariant under coordinate transformations and hence defines a “geometric” volume element.
We will encounter boundary terms. First, the divergence of a vector field X is defined as

$$\nabla \cdot X := \nabla_\mu X^\mu. \qquad (7.11)$$


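From now on we assume Lorentzian signature. In any coordinate system we then have

$$\partial_\mu \sqrt{-g} = \sqrt{-g}\,\Gamma^\nu_{\nu\mu}. \qquad (7.12)$$
Indeed, since the first term in (4.15) cancels the last if $\nu = \rho$, we have

$$\Gamma^\nu_{\nu\mu} = \tfrac12 g^{\rho\sigma} g_{\rho\sigma,\mu}. \qquad (7.13)$$

The right-hand side can be computed by diagonalizing the symmetric invertible matrix $(g_{\rho\sigma})$, yielding nonzero eigenvalues $(\lambda_0,\ldots,\lambda_3)$. Realizing that $(g^{\rho\sigma})$ is the inverse of $(g_{\rho\sigma})$ gives

$$g^{\rho\sigma} g_{\rho\sigma,\mu} = \frac{\partial_\mu\lambda_0}{\lambda_0} + \cdots + \frac{\partial_\mu\lambda_3}{\lambda_3}. \qquad (7.14)$$

Eq. (7.12) then follows from the fact that we also have

$$\frac{\partial_\mu\sqrt{-g}}{\sqrt{-g}} = \frac{\partial_\mu(\lambda_0 \cdots \lambda_3)}{2\lambda_0 \cdots \lambda_3} = \frac12\left(\frac{\partial_\mu\lambda_0}{\lambda_0} + \cdots + \frac{\partial_\mu\lambda_3}{\lambda_3}\right). \qquad (7.15)$$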
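Formula (7.12) is easy to test numerically. The sketch below does so for Minkowski space-time in spherical coordinates $(t,r,\theta,\varphi)$, a choice made here only as an example; it computes the contracted Christoffel symbols from the half-trace $\tfrac12 g^{\rho\sigma}\partial_\mu g_{\rho\sigma}$, specialized to a diagonal metric, and compares with a finite-difference derivative of $\sqrt{-g}$. All function names are hypothetical.

```python
import math

def metric_diag(x):
    """Diagonal entries of the Minkowski metric at x = (t, r, theta, phi)."""
    _, r, th, _ = x
    return [-1.0, 1.0, r * r, (r * math.sin(th)) ** 2]

def sqrt_minus_g(x):
    """sqrt(-det g) for the diagonal metric above (equals r^2 sin(theta))."""
    d = metric_diag(x)
    return math.sqrt(-(d[0] * d[1] * d[2] * d[3]))

def partial(f, x, mu, h=1e-6):
    """Central difference of a scalar function f along coordinate mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    return (f(xp) - f(xm)) / (2.0 * h)

def contracted_christoffel(x, mu):
    """Gamma^nu_{nu mu} = (1/2) g^{rho sigma} d_mu g_{rho sigma},
    written out for a diagonal metric."""
    d = metric_diag(x)
    return 0.5 * sum(
        partial(lambda y, a=a: metric_diag(y)[a], x, mu) / d[a]
        for a in range(4)
    )

x = [0.3, 2.0, 1.1, 0.7]
for mu in range(4):
    lhs = partial(sqrt_minus_g, x, mu)
    rhs = sqrt_minus_g(x) * contracted_christoffel(x, mu)
    assert math.isclose(lhs, rhs, rel_tol=1e-6, abs_tol=1e-8)
```

For $\mu = r$ the contracted symbol evaluates to 2/r, the same combination $\Gamma^\theta_{\theta r} + \Gamma^\varphi_{\varphi r}$ that yields the null expansion 2/r in Minkowski space-time.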




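For later use (see §7.5) we also put on record an identity that follows from (7.12), viz.

$$\partial_\mu(\sqrt{-g}\, g^{\mu\nu}) = -\sqrt{-g}\, g^{\rho\sigma}\Gamma^\nu_{\rho\sigma}. \qquad (7.16)$$
Returning to our theme of boundary terms, eq. (7.12) implies
$$\sqrt{-g}\, \nabla \cdot X = \partial_\mu(\sqrt{-g}\, X^\mu), \qquad (7.17)$$

and hence, by Stokes’s theorem (= divergence theorem = Gauss’s theorem),

$$\int_V d^4x \sqrt{-g}\, \nabla \cdot X(x) = \int_{\partial V} d^3\sigma \cdot X = \int_{\partial V} d^3\sigma_\mu X^\mu, \qquad (7.18)$$

where $\partial V$ is the boundary of V and $d^3\sigma$ is the (outward) normal volume element of $\partial V$. If we use local coordinates $(y^1, y^2, y^3)$ on $\partial V$, and $\tilde{g} = i^*g$ is the induced metric on $\partial V$ (where $i: \partial V \to V$ is the inclusion, so that $\tilde{g}(X,Y) = g(X,Y)$ for X,Y tangent to $\partial V$), we have

$$d^3\sigma = d^3y \sqrt{|\det(\tilde{g})|}\, N, \qquad (7.19)$$

where N is the outward normal to $\partial V$ (here assumed to be non-null, so that $\det(\tilde{g}) \neq 0$).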
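The royal (i.e. geometric) path to (7.18) is to note that (7.17) takes the abstract form

$$\mathcal{L}_X\omega = \omega\, \nabla \cdot X, \qquad (7.20)$$
where the volume form $\omega$ is given by (7.4), and then use Cartan’s formula
$$\mathcal{L}_X\alpha = d(i_X\alpha) + i_X(d\alpha) \qquad (7.21)$$

for the Lie derivative of any p-form $\alpha \in \Omega^p(M)$, p > 0, where $X \in \mathfrak{X}(M)$. Since $\omega \in \Omega^n(M)$ we must have $d\omega = 0$, so that Cartan’s formula for the volume form is

$$\mathcal{L}_X\omega = d(i_X\omega). \qquad (7.22)$$

With (7.20), this gives $\omega\, \nabla \cdot X = d(i_X\omega)$. The abstract version of Stokes’s theorem reads

$$\int_V d\alpha = \int_{\partial V} \alpha \qquad (7.23)$$

for any $\alpha \in \Omega^{n-1}(M)$. Taking $\alpha = i_X\omega$ and hence $d\alpha = \mathcal{L}_X\omega$ gives (7.18) geometrically:

$$\int_V \omega\, \nabla \cdot X = \int_{\partial V} i_X\omega. \qquad (7.24)$$

Moreover, the form in (7.19) is $\sigma = i_N\omega$. Using (7.21) twice, as well as (7.20), we obtain
$$\mathcal{L}_N\sigma = \mathcal{L}_N i_N\omega = i_N d(i_N\omega) = i_N\mathcal{L}_N\omega = i_N\omega\, \nabla \cdot N = \sigma\, \nabla \cdot N. \qquad (7.25)$$

This formula will not be used in what follows, but it was used in §6.1 and hence needed proof.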
7.2 Variation of the Einstein-Hilbert action


In order to set up the variational calculus around (7.2), as in the geodesic case (§3.2) we now consider a family of metrics $g_s$, and compute $dS_G(g_s)/ds$. This requires some preparation.

1. Each of the three terms in the integrand $\sqrt{-g}\, g^{\mu\nu}R_{\mu\nu}$ in (7.2) depends on the metric $g_{\mu\nu}$ and hence has to be varied. The variation of the Ricci tensor seems the most complicated case, but it contributes only a divergence term and hence makes no contribution to the Einstein equations (7.1). This is surprising, since definitions (4.14) and (4.109) give

$$R_{\mu\nu} = \partial_\rho\Gamma^\rho_{\nu\mu} - \partial_\nu\Gamma^\rho_{\rho\mu} + \Gamma^\rho_{\rho\sigma}\Gamma^\sigma_{\nu\mu} - \Gamma^\rho_{\nu\sigma}\Gamma^\sigma_{\rho\mu}, \qquad (7.26)$$

whose first two terms contain second-order derivatives of $g_{\mu\nu}$. Their variation would therefore in principle be expected to give a fourth-order PDE, but this does not happen.$^{315}$


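2. Writing $\delta F(g) = dF(g_s)/ds|_{s=0}$ and $d(g_s)_{\mu\nu}/ds|_{s=0} = \delta g_{\mu\nu} = d_{\mu\nu}$, we claim that
$$g^{\mu\nu}\delta R_{\mu\nu} = \nabla \cdot X; \qquad (7.27)$$
$$X^\mu = \nabla_\nu d^{\mu\nu} - \nabla^\mu d^\nu_\nu, \qquad (7.28)$$

where indices are always raised and lowered with the metric $g = g_{s=0}$. In this respect the notation $\delta g^{\mu\nu}$ is ambiguous, as it could mean either $(\delta g)^{\mu\nu} = g^{\mu\rho}g^{\nu\sigma}\delta g_{\rho\sigma}$ or

$$\delta g^{\mu\nu} = \delta(g^{-1})^{\mu\nu} = -g^{\mu\rho}g^{\nu\sigma}d_{\rho\sigma}, \qquad (7.29)$$
which is what we will take it to mean. Then (7.29) follows from $g^{\mu\nu}g_{\nu\rho} = \delta^\mu_\rho$, and hence
$$0 = \delta(g^{\mu\nu}g_{\nu\rho}) = (\delta g^{\mu\nu})g_{\nu\rho} + g^{\mu\nu}d_{\nu\rho}. \qquad (7.30)$$
The key step in the proof of (7.27) - (7.28) is the relation
$$\delta\Gamma^\rho_{\mu\nu} = \tfrac12(\nabla_\mu d^\rho_\nu + \nabla_\nu d^\rho_\mu - \nabla^\rho d_{\mu\nu}) = \tfrac12 g^{\rho\sigma}(\nabla_\mu d_{\sigma\nu} + \nabla_\nu d_{\sigma\mu} - \nabla_\sigma d_{\mu\nu}). \qquad (7.31)$$
This can be shown by a lengthy computation, but also by the following instructive trick.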


(a) First note that although the coefficients $\Gamma^\rho_{\mu\nu}$ do not form the components of a tensor, their variation $\delta\Gamma^\rho_{\mu\nu}$ does. This is true far more generally: if $\nabla$ and $\tilde\nabla$ are connections on a vector bundle E, then $(\tilde\nabla_X - \nabla_X)s$ is $C^\infty(M)$-linear in $s \in \Gamma(E)$ (unlike $\tilde\nabla_X s$ and $\nabla_X s$), since the spoiler (Xf)s in the Leibniz rule (3.56) drops out of the difference. As a case in point, let $\nabla$ be the Levi-Civita connection for a given metric g and let $\tilde\nabla$ be the Levi-Civita connection for some other metric $\tilde{g}$. We then have a tensor $\tau \in \mathfrak{X}^{(2,1)}(M)$, defined by $\tau(X,Y,\theta) = \theta(\tilde\nabla_X Y - \nabla_X Y)$, whose coefficients are $\tilde\Gamma^\rho_{\mu\nu} - \Gamma^\rho_{\mu\nu}$, cf. (3.37). In particular, we may take $\tilde{g} = g_s$, and since

$$\delta\Gamma^\rho_{\mu\nu}(g) = \lim_{s \to 0}\, (\Gamma^\rho_{\mu\nu}(g_s) - \Gamma^\rho_{\mu\nu}(g))/s, \qquad (7.32)$$

we may conclude that the coefficients $\delta\Gamma^\rho_{\mu\nu}$ form the components of a tensor $\delta\Gamma$.


315 Lovelock's Theorem (Lovelock, 1971; Navarro & Navarro, 2010) states that in d = 4 the Einstein-Hilbert
action (7.2) is the only possible geometric quantity giving rise to second-order PDE in the $g_{\mu\nu}$, except for adding a
(cosmological) constant A = —A to the Ricci scalar R. See Anderson (1981) for an extension to matter couplings. 
(b) Let $\sigma$ and $\tau$ be tensors of the same type, say (1,1). Then $\sigma = \tau$ is true iff for each
$x \in M$ one has $\sigma_\mu{}^\nu(x) = \tau_\mu{}^\nu(x)$ in just one specific coordinate system $(x^\mu)$ defined on
some nbhd $U$ of $x$, which system may even depend on $x$ (like GNC). For in that case
we have $\sigma_x(\partial_\mu, dx^\nu) = \tau_x(\partial_\mu, dx^\nu)$, and so, by $C^\infty(M)$-linearity of $\sigma$ and $\tau$ in their
arguments, $\sigma(X,\theta) = \tau(X,\theta)$, where we write $X = X^\mu\partial_\mu$ and $\theta = \theta_\nu dx^\nu$ as usual,
for some $X^\mu \in C^\infty(U)$ and $\theta_\nu \in C^\infty(U)$. And similarly for tensors of any type $(k,l)$.
It therefore suffices to verify (7.31) in geodesic normal coordinates, where at $x = x_0$
we have $\nabla = \partial$, cf. (5.38). In GNC one does not even need (7.29), since $\delta g^{\rho\sigma}$ in
(4.15) multiplies terms that vanish at $x_0$, and hence (7.31) is almost trivial.


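(c) Similarly, noting that in GNC the variation $\delta R_{\mu\nu}$ only employs the first two terms in (7.26),
in which $\delta(\partial_\rho\Gamma^\rho_{\mu\nu}) = \partial_\rho\,\delta\Gamma^\rho_{\mu\nu}$ (etc.) can be computed from (7.31), one obtains

$$\delta R_{\mu\nu} = \tfrac12(\nabla_\rho\nabla_\mu d^\rho{}_\nu + \nabla_\rho\nabla_\nu d^\rho{}_\mu - \nabla_\mu\nabla_\nu d^\rho{}_\rho - \nabla^\rho\nabla_\rho d_{\mu\nu}), \qquad (7.33)$$

where we note that the third term is symmetric in $\mu$ and $\nu$ because of (4.13) and (4.37).
Contraction with $g^{\mu\nu}$ then makes the first two terms identical to each other, and similarly,
the last two. This immediately leads to (7.27) - (7.28).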
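The contraction in this last step can be written out explicitly. The following is a sketch only: it assumes the reading $X^\mu = \nabla_\nu d^{\mu\nu} - \nabla^\mu d^\nu{}_\nu$ of (7.28) and uses nothing beyond metric compatibility, $\nabla g = 0$, to move $g^{\mu\nu}$ through the covariant derivatives.

```latex
\begin{align*}
g^{\mu\nu}\,\delta R_{\mu\nu}
 &= \tfrac12\bigl(\nabla_\rho\nabla_\mu d^{\rho\mu} + \nabla_\rho\nabla_\nu d^{\rho\nu}
      - \nabla^\mu\nabla_\mu d^{\rho}{}_{\rho} - \nabla^\rho\nabla_\rho d^{\nu}{}_{\nu}\bigr) \\
 &= \nabla_\mu\nabla_\nu d^{\mu\nu} - \nabla^\mu\nabla_\mu d^{\nu}{}_{\nu}
    % first two terms coincide after relabelling, as do the last two
    \\
 &= \nabla_\mu\bigl(\nabla_\nu d^{\mu\nu} - \nabla^\mu d^{\nu}{}_{\nu}\bigr)
  = \nabla_\mu X^\mu .
\end{align*}
```

Since the result is a total divergence, it can only contribute a boundary term to the variation of the action.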


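3. The computation of $\delta\sqrt{-g}$ is based on the relation $\partial g/\partial g_{\mu\nu} = g^{\mu\nu}g$,$^{316}$ which implies

$$\delta\sqrt{-g} = \frac{\partial\sqrt{-g}}{\partial g_{\mu\nu}}\,\delta g_{\mu\nu} = -\frac{1}{2\sqrt{-g}}\frac{\partial g}{\partial g_{\mu\nu}}\,d_{\mu\nu} = \tfrac12\sqrt{-g}\,g^{\mu\nu}d_{\mu\nu}. \qquad (7.34)$$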
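A quick sanity check of (7.34) may be helpful. The sketch below assumes space-time dimension 4 and uses a hypothetical conformal family $g_s = e^{2\epsilon s}g$ (with $\epsilon$ a constant introduced here only for illustration), for which both sides can be computed independently.

```latex
\begin{align*}
% direct computation: det(g_s) = e^{8\epsilon s}\det(g) in dimension 4
\sqrt{-g_s} &= e^{4\epsilon s}\sqrt{-g}
 \quad\Longrightarrow\quad
 \frac{d}{ds}\sqrt{-g_s}\,\Big|_{s=0} = 4\epsilon\,\sqrt{-g}, \\
% via (7.34), with d_{\mu\nu} = 2\epsilon\, g_{\mu\nu} and g^{\mu\nu}g_{\mu\nu} = 4
\tfrac12\sqrt{-g}\;g^{\mu\nu}d_{\mu\nu}
 &= \tfrac12\sqrt{-g}\cdot 2\epsilon\cdot 4 = 4\epsilon\,\sqrt{-g}.
\end{align*}
```

Both routes agree, which also confirms the overall sign: increasing the metric increases the volume density.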


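4. Since we already know $\delta g^{\mu\nu}$ from (7.29), we are finally in a position to compute:

$$\begin{aligned} S_G'(g) = \frac{dS_G(g_s)}{ds}(s=0) &= \int_V d^4x\,\delta\bigl(\sqrt{-g}\,g^{\mu\nu}R_{\mu\nu}\bigr) \\ &= \int_V d^4x\,\bigl[(\delta\sqrt{-g})\,g^{\mu\nu}R_{\mu\nu} + \sqrt{-g}\,(\delta g^{\mu\nu})R_{\mu\nu} + \sqrt{-g}\,g^{\mu\nu}\delta R_{\mu\nu}\bigr] \\ &= \int_V d^4x\,\sqrt{-g}\,\bigl(\tfrac12 g^{\mu\nu}R - R^{\mu\nu}\bigr)d_{\mu\nu} + \int_{\partial V} d^3\Sigma^\mu\,(\nabla^\nu d_{\mu\nu} - \nabla_\mu d^\nu{}_\nu). \end{aligned} \qquad (7.35)$$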
V V 


5. Now a delicate point is that although by definition of the variational principle $d_{\mu\nu}$ vanishes
on the boundary $\partial V$, this need not be the case for its (covariant) derivatives $\nabla^\nu d_{\mu\nu}$ etc. To
cancel the problematic boundary term in (7.35) one needs to add a boundary term $S_B(g)$
to the Einstein-Hilbert action (7.2), giving a gravitational action $S = S_G + S_B$, where


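$$S_B(g) = -2\varepsilon\int_{\partial V} d^3y\,\sqrt{|\det(\tilde g)|}\,\bigl(\mathrm{Tr}(k) - \mathrm{Tr}(k^0)\bigr). \qquad (7.36)$$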
Here $\varepsilon = g(N,N)$ equals $\varepsilon = 1$ if $\partial V$ is timelike and $\varepsilon = -1$ if $\partial V$ is spacelike; in a 3+1
split, where $V$ typically looks like the bulk part of a cylinder, $\partial V$ consists of two spacelike
components that bound $V$ from above and from below, as well as a single timelike part
(cf. §8.7). Furthermore, $\mathrm{Tr}(k)$ is the trace of the extrinsic curvature of the embedding $\partial V \hookrightarrow V$.
The extrinsic curvature, studied in detail in §4.7, is defined by

$$k(X,Y) := -g(\nabla_X N, Y), \qquad (7.37)$$


316 This follows from linear algebra: $\partial g/\partial g_{\mu\nu} = m^{\mu\nu}$, i.e. the minor = cofactor of $g_{\mu\nu}$, and $g^{\mu\nu} = m^{\nu\mu}/g$.
where we dropped the arrow on the normal $N$, and $X,Y$ are tangent to $\partial V$; this form turns
out to be a symmetric tensor $k \in \mathfrak{X}^{(2,0)}(\partial V)$ on $\partial V$, like the induced metric $\tilde g$, cf. (4.146).
Nonetheless, there is a convenient space-time calculus for both $\tilde g$ and $k$, defined via

$$\tilde g_{\mu\nu} := g_{\mu\nu} - \varepsilon N_\mu N_\nu; \qquad (7.38)$$
$$k_{\mu\nu} := -\tilde g_\mu{}^\rho\,\tilde g_\nu{}^\sigma\,\nabla_\rho N_\sigma, \qquad (7.39)$$

whose indices are raised and lowered with $g$. However, since $N^\mu N^\nu k_{\mu\nu} = 0$, the trace is

$$\mathrm{Tr}(k) = \tilde g^{\mu\nu}(\Gamma^\rho_{\mu\nu}N_\rho - \partial_\mu N_\nu). \qquad (7.40)$$


Finally, $k^0$ in (7.36) is the extrinsic curvature of the embedding $\partial V \hookrightarrow \mathbb{M}$, where $\mathbb{M}$ is
Minkowski space-time (its trace is taken with respect to the Minkowski metric). This term
is necessary for (7.36) to converge if $\partial V$ stretches out to spatial infinity.


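In computing the variation of $S_B(g)$, it simplifies matters greatly that $d_{\mu\nu}$ vanishes on $\partial V$,
as do all its derivatives along $\partial V$, so that only its derivatives along $N$ need to be taken into
account. For example, $\delta\det(\tilde g)$ vanishes on $\partial V$, and for $\delta\mathrm{Tr}(k)$ on $\partial V$ we find

$$\delta\mathrm{Tr}(k) = \tilde g^{\mu\nu}N_\rho\,\delta\Gamma^\rho_{\mu\nu} = -\tfrac12\tilde g^{\mu\nu}N^\rho\partial_\rho d_{\mu\nu}, \qquad (7.41)$$

so that

$$\delta S_B(g) = \varepsilon\int_{\partial V} d^3y\,\sqrt{|\det(\tilde g)|}\,\tilde g^{\mu\nu}N^\rho\partial_\rho d_{\mu\nu}. \qquad (7.42)$$

On the other hand, on $\partial V$ where $d_{\mu\nu} = 0$, we have, for the boundary term in (7.35),

$$\begin{aligned} N^\mu(\nabla^\nu d_{\mu\nu} - \nabla_\mu d^\nu{}_\nu) &= N^\mu g^{\alpha\beta}(\partial_\alpha d_{\mu\beta} - \partial_\mu d_{\alpha\beta}) \\ &= N^\mu(\tilde g^{\alpha\beta} + \varepsilon N^\alpha N^\beta)(\partial_\alpha d_{\mu\beta} - \partial_\mu d_{\alpha\beta}) \\ &= -\tilde g^{\alpha\beta}N^\mu\partial_\mu d_{\alpha\beta}, \end{aligned} \qquad (7.43)$$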


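since $\tilde g^{\alpha\beta}\partial_\alpha d_{\mu\beta} = 0$ on $\partial V$ and $N^\mu N^\alpha N^\beta(\partial_\alpha d_{\mu\beta} - \partial_\mu d_{\alpha\beta}) = 0$ identically. Hence

$$\int_{\partial V} d^3\Sigma^\mu\,(\nabla^\nu d_{\mu\nu} - \nabla_\mu d^\nu{}_\nu) = -\varepsilon\int_{\partial V} d^3y\,\sqrt{|\det(\tilde g)|}\,\tilde g^{\alpha\beta}N^\mu\partial_\mu d_{\alpha\beta}, \qquad (7.44)$$

where we used (7.19), so that the last term in (7.35) cancels (7.42), as intended.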


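6. In view of these computations, we obtain for the variation of $S(g) = S_G(g) + S_B(g)$:

$$\delta S(g) = \int_V d^4x\,\sqrt{-g}\,\bigl(R_{\mu\nu} - \tfrac12 g_{\mu\nu}R\bigr)\delta g^{\mu\nu}, \qquad (7.45)$$

where we used (7.29). If there were no matter in the universe, then requiring $S'(g) = 0$
for arbitrary variations $d_{\mu\nu}$ (or $\delta g^{\mu\nu}$) therefore gives the vacuum Einstein equations

$$R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = 0. \qquad (7.46)$$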
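It may be worth spelling out why the vacuum equations $R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = 0$ are equivalent to the vanishing of the Ricci tensor, a form used later in the text. The following short computation is a sketch that assumes space-time dimension 4, so that $g^{\mu\nu}g_{\mu\nu} = 4$.

```latex
\begin{align*}
0 &= g^{\mu\nu}\bigl(R_{\mu\nu} - \tfrac12 g_{\mu\nu}R\bigr) = R - 2R = -R
   \quad\Longrightarrow\quad R = 0, \\
0 &= R_{\mu\nu} - \tfrac12 g_{\mu\nu}R = R_{\mu\nu}
   \quad\text{(substituting } R = 0\text{)}.
\end{align*}
```

Conversely, $R_{\mu\nu} = 0$ trivially implies $R = 0$ and hence the vacuum equations, so the two formulations carry the same information.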


It was a fact of great importance to Einstein that the gravitational action (7.2) is, as he called 
it, generally covariant, i.e., invariant under arbitrary coordinate transformations. See also §1.10. 
We would now rather say that $S_G(g)$ is invariant under (orientation-preserving) diffeomorphisms.


This has a very interesting consequence.$^{317}$ Consider variations of the metric that take the form

$$g_s = \varphi_s^* g, \qquad (7.47)$$

where $\varphi_s$ is a one-parameter group of diffeomorphisms of $M$ that preserve $V$, arising as the flow
of a vector field $X \in \mathfrak{X}(M)$ having compact support within $V$ (in which case $X$ is complete). As
a special case of (2.93), for the above variations (7.47) we have

$$\frac{dg_s}{ds}(s=0) = \mathcal{L}_X g, \qquad (7.48)$$

and hence, using (3.73),

$$d_{\mu\nu} = \nabla_\mu X_\nu + \nabla_\nu X_\mu. \qquad (7.49)$$

Although this may seem obvious, we now explicitly show that

$$S_G(\varphi^* g) = S_G(g). \qquad (7.50)$$

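Indeed, starting from $S_G(g) = \int_V \omega_g R_g$, where we have now explicitly indicated the $g$-dependence
of $\omega$ and $R$, we obtain $\varphi^*\omega_g = \omega_{\varphi^*g}$ and $\varphi^*R_g = R_{\varphi^*g}$, so that

$$\omega_{\varphi^*g}R_{\varphi^*g} = \varphi^*\omega_g\,\varphi^*R_g = \varphi^*(\omega_g R_g). \qquad (7.51)$$

For any top-dimensional form $\alpha \in \Omega^n(M)$ (with compact support) one has

$$\int_V \varphi^*\alpha = \int_V \alpha, \qquad (7.52)$$

so the transformed action equals

$$S_G(\varphi^*g) = \int_V \omega_{\varphi^*g}R_{\varphi^*g} = \int_V \varphi^*(\omega_g R_g) = \int_V \omega_g R_g = S_G(g). \qquad (7.53)$$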


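Therefore, for variations of the kind (7.47) we have $S_G'(g) = 0$ for any metric $g$, that is, whether
or not $g$ solves the vacuum Einstein equations; the latter guarantee that $S_G'(g) = 0$ under arbitrary
variations of $g$, as opposed to the special ones (7.47). Using the Einstein tensor

$$G_{\mu\nu} = R_{\mu\nu} - \tfrac12 g_{\mu\nu}R, \qquad (7.54)$$

which like $R_{\mu\nu}$ and $g_{\mu\nu}$ is symmetric, from (7.50) we therefore have

$$\begin{aligned} 0 = S_G'(g) &= -\int_V d^4x\,\sqrt{-g}\,G^{\mu\nu}(\nabla_\mu X_\nu + \nabla_\nu X_\mu) \\ &= 2\int_V d^4x\,\sqrt{-g}\,(\nabla_\mu G^{\mu\nu})X_\nu - 2\int_V d^4x\,\sqrt{-g}\,\nabla_\mu(G^{\mu\nu}X_\nu) \\ &= 2\int_V d^4x\,\sqrt{-g}\,(\nabla_\mu G^{\mu\nu})X_\nu, \end{aligned} \qquad (7.55)$$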


since as in (7.45) the second term in the middle line is a boundary integral, which vanishes since 
X was assumed to have compact support within V. The final term must then vanish for arbitrary 
X. This recovers the (contracted) Bianchi identity, which holds, once again, for any metric g: 


$$\nabla_\mu G^{\mu\nu} = 0. \qquad (7.56)$$
This also follows from (4.25) and (7.54). Its impact on GR will be studied in §7.5. 


317 These also follow from Noether’s second theorem (cf. footnote 100), but are in fact easier to understand directly. 

7.3 The energy-momentum tensor


The left-hand side of the Einstein equation (7.1) describes the geometry of space-time. The right-hand
side $T_{\mu\nu}$ (times $8\pi$), called the energy-momentum tensor, describes the matter content of
the universe. The first thing one infers from (7.1) is that $T \in \mathfrak{X}^{(2,0)}(M)$ has to satisfy

$$T_{\mu\nu} = T_{\nu\mu}. \qquad (7.57)$$

This makes index raising unambiguous, so that we may write $T^\mu{}_\nu$ for either $g^{\mu\rho}T_{\rho\nu}$ or $g^{\mu\rho}T_{\nu\rho}$.
Relative to a swarm of observers whose four-velocities $u$, normalized by (6.4), comprise a (local)
timelike congruence (cf. §6.1), the energy-momentum four-vector of matter is $-T^\mu{}_\nu u^\nu$, and

$$E = T(u,u) = T_{\mu\nu}u^\mu u^\nu \qquad (7.58)$$

is the (relative) energy density of the matter. Similarly, one has a (covariant) momentum density

$$P_\mu = -h_\mu{}^\rho\,T_{\rho\sigma}u^\sigma, \qquad (7.59)$$

cf. (6.11), which is orthogonal to $u$, i.e., $g(P,u) = 0$. The fully orthogonal projection of $T$, viz.

$$S_{\mu\nu} = h_\mu{}^\rho\,h_\nu{}^\sigma\,T_{\rho\sigma}, \qquad (7.60)$$


is the stress tensor (of the given matter): if $X$ and $Y$ are spacelike unit vectors orthogonal to $u$,
then $S(X,Y)$ is the force exerted by the matter in the direction $X$ on the spacelike unit surface
element normal to $Y$, and vice versa, since $S(X,Y) = S(Y,X)$. This gives the decomposition

$$T_{\mu\nu} = S_{\mu\nu} + P_\mu u_\nu + P_\nu u_\mu + E\,u_\mu u_\nu. \qquad (7.61)$$

Since the Einstein equations may be rewritten as

$$R_{\mu\nu} = 8\pi\bigl(T_{\mu\nu} - \tfrac12 g_{\mu\nu}T\bigr), \qquad (7.62)$$

where $T = T^\mu{}_\mu = g^{\mu\nu}T_{\mu\nu}$ is the trace of $T$, it is often useful to know that, as implied by (7.61),

$$T = S - E, \qquad (7.63)$$

where $S = g^{\mu\nu}S_{\mu\nu}$ is purely spatial, i.e. $S = \sum_{i=1}^3 S(e_i,e_i)$ for some o.n.b. $(e_i)$ orthogonal to $u$.
Assuming (7.1), the curvature condition (6.34) in Hawking's Theorem 6.4 is equivalent to

$$E \geq -S. \qquad (7.64)$$
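The equivalence just stated follows from a two-line computation. The sketch below assumes that the normalization (6.4) reads $g(u,u) = -1$ and that the curvature condition (6.34) is $R_{\mu\nu}u^\mu u^\nu \geq 0$; both are assumptions about equations not reproduced in this section.

```latex
\begin{align*}
% trace of the decomposition (7.61), using g(P,u) = 0 and g(u,u) = -1
T &= g^{\mu\nu}T_{\mu\nu} = S + 2\,g(P,u) + E\,g(u,u) = S - E, \\
% contraction of (7.62) with u^\mu u^\nu
R_{\mu\nu}u^\mu u^\nu
  &= 8\pi\bigl(T_{\mu\nu}u^\mu u^\nu - \tfrac12\,g(u,u)\,T\bigr)
   = 8\pi\bigl(E + \tfrac12(S - E)\bigr) = 4\pi\,(E + S).
\end{align*}
```

Hence $R_{\mu\nu}u^\mu u^\nu \geq 0$ holds precisely when $E + S \geq 0$.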

More generally, the most straightforward energy conditions used in GR are the following:

$$T_{\mu\nu}X^\mu Y^\nu \geq 0, \quad X \sim Y \text{ causal}, \quad \text{(dominant energy condition = DEC)}; \qquad (7.65)$$
$$T_{\mu\nu}X^\mu X^\nu \geq 0, \quad X \text{ causal}, \quad \text{(weak energy condition = WEC)}; \qquad (7.66)$$
$$T_{\mu\nu}X^\mu X^\nu \geq \tfrac12 X^\mu X_\mu T, \quad X \text{ causal}, \quad \text{(strong energy condition = SEC)}; \qquad (7.67)$$
$$T_{\mu\nu}X^\mu X^\nu \geq 0, \quad X \text{ null}, \quad \text{(null energy condition = NEC)}, \qquad (7.68)$$

where $X \sim Y$ denotes that $X$ and $Y$, both causal, should be either both fd or both pd. One has
obvious implications DEC $\Rightarrow$ WEC $\Rightarrow$ NEC and SEC $\Rightarrow$ NEC, and DEC is equivalent to WEC
plus the requirement that $T^\mu{}_\nu X^\nu$ be causal for causal $X$. As such, it may be strengthened by
the strengthened dominant energy condition = SDEC, which requires (7.66) plus the condition
that $T^\mu{}_\nu X^\nu$ be timelike for timelike $X$, provided $T_{\mu\nu} \neq 0$. DEC will be used e.g. in black hole
thermodynamics, cf. Proposition 10.38. Here is a completely different application of DEC:$^{318}$


318 See Malament (2012), Prop. 2.5.1, Hawking & Ellis, §4.3, and, in final form, Minguzzi (2015b).

Proposition 7.1 Suppose a symmetric tensor $T_{\mu\nu}$ satisfies DEC and the conservation law

$$\nabla^\mu T_{\mu\nu} = 0. \qquad (7.69)$$

If $S \subset M$ is an achronal set on which $T_{\mu\nu} = 0$, then $T_{\mu\nu}$ also vanishes on $D(S)$, cf. (5.170).


If $T_{\mu\nu}$ is "the" energy-momentum tensor, then (7.69) follows either from the Bianchi identity
(7.56) and Einstein's equation (7.1), or, if $T_{\mu\nu}$ can be derived from an action principle, from an
argument like the one at the end of §7.2. To see SDEC in action, we mention another difficult
result, making Einstein's idea that (7.1) implies geodesic motion of test particles rigorous:$^{319}$

Proposition 7.2 Suppose a symmetric tensor $T_{\mu\nu}$ satisfies SDEC and (7.69). Let $c : I \to M$
be a curve such that $T_{\mu\nu} = 0$ outside any nbhd of $c(I)$ but $T_{\mu\nu}(c(t)) \neq 0$ for some $t \in I$. Then $c$
can be reparametrized (if necessary) so as to become a timelike geodesic, cf. (3.48).


The idea is that $T_{\mu\nu}$ describes a point-like "test-particle", which moves under the influence of
gravity but does not act as a source. Note that the Einstein equations (7.1) are not even assumed!
A much simpler result can be derived for so-called dust, with energy-momentum tensor

$$T_{\mu\nu} = \rho\,u_\mu u_\nu, \qquad (7.70)$$

where $\rho \in C^\infty(M)$ is the mass density and $u$ is as above, including (6.4). Eq. (7.69) gives

$$\nabla_\mu(\rho u^\mu)\cdot u + \rho\,\nabla_u u = 0. \qquad (7.71)$$

Since $g(u,\nabla_u u) = 0$ because of the normalization (6.4), contraction with $u$ yields two independent conditions

$$\nabla_\mu(\rho u^\mu) = 0; \qquad \nabla_u u = 0, \qquad (7.72)$$


of which the first is a conservation law and the second is just the geodesic equation for $u$. Eq.
(7.70) is a special case of the energy-momentum tensor of a perfect fluid, which is given by

$$T_{\mu\nu} = (\varepsilon + p)u_\mu u_\nu + p\,g_{\mu\nu} = \varepsilon\,u_\mu u_\nu + p\,h_{\mu\nu}, \qquad (7.73)$$

where the energy density $\varepsilon$ is related to the pressure $p$ through some equation of state,
such as $p = 0$ (dust, as above) or $p = \tfrac13\varepsilon$ (ultrarelativistic fluid). Eq. (7.69) now gives

$$(\varepsilon + p)\nabla_\mu u^\mu + u(\varepsilon) = 0; \qquad (\varepsilon + p)\nabla_u u^\mu + h^{\mu\nu}\partial_\nu p = 0, \qquad (7.74)$$

called the (relativistic) Euler equations. The quantities (7.58) - (7.60) are obviously given by

$$E = \varepsilon; \qquad P = 0; \qquad S_{\mu\nu} = p\,h_{\mu\nu}, \qquad (7.75)$$

so that $S = 3p$ and $T = 3p - \varepsilon$. The energy conditions then come down to (nontrivial exercise!):

• SEC holds iff $\varepsilon + p \geq 0$ and $\varepsilon + 3p \geq 0$;

• WEC holds iff $\varepsilon + p \geq 0$ and $\varepsilon \geq 0$;

• DEC and SDEC coincide in the case of (7.73) and both hold iff $\varepsilon \geq |p|$.
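For the WEC and SEC cases the exercise can be organized as follows. This is only a sketch, under the assumptions that $g(u,u) = -1$ and that any causal vector is written as $X = a\,u + b\,e$, where $e$ is a unit spacelike vector orthogonal to $u$ and $a^2 \geq b^2$ (the symbols $a$, $b$, $e$ are introduced here for illustration).

```latex
\begin{align*}
% components of the perfect fluid tensor (7.73)
T(u,u) &= \varepsilon, \qquad T(e,e) = p, \qquad T(u,e) = 0, \qquad g(X,X) = b^2 - a^2 \leq 0, \\
% weak energy condition (7.66)
T(X,X) &= \varepsilon a^2 + p\,b^2 = \varepsilon\,(a^2 - b^2) + (\varepsilon + p)\,b^2, \\
% strong energy condition (7.67), with T = 3p - \varepsilon
T(X,X) - \tfrac12\,g(X,X)\,T
  &= \tfrac12(\varepsilon + 3p)\,a^2 + \tfrac12(\varepsilon - p)\,b^2
   = \tfrac12(\varepsilon + 3p)\,(a^2 - b^2) + (\varepsilon + p)\,b^2 .
\end{align*}
```

Both right-hand sides are nonnegative for all $a^2 \geq b^2$ exactly when the two coefficients are nonnegative: taking $b = 0$ isolates the first coefficient and $a^2 = b^2$ isolates the second. This reproduces the stated criteria for WEC and SEC; the DEC case needs an additional argument and is not covered by this sketch.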


319 This idea goes back to Einstein & Grommer (1927) and Einstein, Infeld, & Hoffmann (1938). For Proposition 7.2
see Geroch & Jang (1975), as well as Geroch & Weatherall (2018) for further results. As in footnote 289, we refer 
to Curiel (2014a) and Martin-Moruno & Visser (2017) for more information about energy conditions. 

Except for fluids,$^{320}$ most energy-momentum tensors are derived from an action principle, like
the Einstein equations. The idea is that the "coupling" of gravity to matter is described by a
functional $S_M(g,\phi)$, where $\phi$ stands for all matter fields, so that, analogously to (7.45), one has

$$S_M'(g,\phi) = -\tfrac12\int_V d^4x\,\sqrt{-g}\,T_{\mu\nu}\,\delta g^{\mu\nu}, \qquad (7.76)$$

where the prime has the same meaning as in §7.2 (varying the metric), or, as physicists write,$^{321}$

$$T_{\mu\nu} = -\frac{2}{\sqrt{-g}}\frac{\delta S_M(g,\phi)}{\delta g^{\mu\nu}}. \qquad (7.77)$$

In this notation, the Einstein equation (7.1) then simply states that

$$\frac{\delta}{\delta g^{\mu\nu}}\bigl(S_G(g) + 16\pi S_M(g,\phi)\bigr) = 0. \qquad (7.78)$$

This equation for the metric $g_{\mu\nu}$ is to be supplemented with equations for the field(s),$^{322}$

$$\frac{\delta S_M(g,\phi)}{\delta\phi} = 0. \qquad (7.79)$$

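The simplest example is a scalar field $\phi \in C^\infty(M)$, whose action functional is

$$S_M(g,\phi) = -\tfrac12\int_V d^4x\,\sqrt{-g}\,\bigl(g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi + V(\phi)\bigr) = -\tfrac12\int_V \bigl(g(\nabla\phi,\nabla\phi) + V(\phi)\bigr), \qquad (7.80)$$

where $V : \mathbb{R} \to \mathbb{R}$ is a "potential" (which for a free field equals $V(\phi) = m^2\phi^2$). The computation
(7.45), with $R_{\mu\nu}$ replaced by $\partial_\mu\phi\,\partial_\nu\phi$ (so that there isn't even a boundary term) gives

$$T_{\mu\nu} = \partial_\mu\phi\,\partial_\nu\phi - \tfrac12 g_{\mu\nu}\bigl(g(\nabla\phi,\nabla\phi) + V(\phi)\bigr). \qquad (7.81)$$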
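The computation leading to the scalar-field energy-momentum tensor can be made explicit. The sketch below assumes the convention $S_M' = -\tfrac12\int_V d^4x\,\sqrt{-g}\,T_{\mu\nu}\,\delta g^{\mu\nu}$ for (7.76) and the action $S_M = -\tfrac12\int_V d^4x\,\sqrt{-g}\,(g^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi + V(\phi))$ for (7.80); it varies the metric only, keeping $\phi$ fixed.

```latex
\begin{align*}
% (7.34) rewritten in terms of the inverse metric, using (7.29)
\delta\sqrt{-g} &= \tfrac12\sqrt{-g}\,g^{\mu\nu}\,\delta g_{\mu\nu}
                 = -\tfrac12\sqrt{-g}\,g_{\mu\nu}\,\delta g^{\mu\nu}, \\
S_M'(g,\phi) &= -\tfrac12\int_V d^4x\,\Bigl[\sqrt{-g}\,\partial_\mu\phi\,\partial_\nu\phi\;\delta g^{\mu\nu}
      + (\delta\sqrt{-g})\bigl(g(\nabla\phi,\nabla\phi) + V(\phi)\bigr)\Bigr] \\
 &= -\tfrac12\int_V d^4x\,\sqrt{-g}\,
      \Bigl[\partial_\mu\phi\,\partial_\nu\phi
      - \tfrac12\,g_{\mu\nu}\bigl(g(\nabla\phi,\nabla\phi) + V(\phi)\bigr)\Bigr]\delta g^{\mu\nu}.
\end{align*}
```

Comparing the last line with the assumed convention reads off the bracket as $T_{\mu\nu}$, in agreement with (7.81). No integration by parts is needed because the action contains no derivatives of the metric.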
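Another case of interest is the electromagnetic field $A \in \Omega^1(M)$, with $F = dA \in \Omega^2(M)$, or

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu = \nabla_\mu A_\nu - \nabla_\nu A_\mu, \qquad (7.82)$$

where the last equality follows because $\nabla$ is torsion-free. The (free) action is

$$S_M(g,A) = -\frac{1}{16\pi}\int_V d^4x\,\sqrt{-g}\,g^{\mu\rho}g^{\nu\sigma}F_{\mu\nu}F_{\rho\sigma} = -\frac{1}{16\pi}\int_V F^2, \qquad (7.83)$$

with $F^2 = F_{\mu\nu}F^{\mu\nu}$, from which a brief computation yields the energy-momentum tensor

$$T_{\mu\nu} = \frac{1}{4\pi}\bigl(g^{\rho\sigma}F_{\mu\rho}F_{\nu\sigma} - \tfrac14 g_{\mu\nu}F^2\bigr), \qquad (7.84)$$

where the last term comes from the variation of $\sqrt{-g}$ and the first one comes from $\delta(g^{\mu\rho}g^{\nu\sigma})$.
For later use (see §§10.9-10.10), we note that (7.84) satisfies DEC, and hence certainly NEC.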


320 Even for ideal fluids one has a (constrained) action principle due to A.H. Taub, but it is extremely contrived.

321 In order to obtain the correct Einstein equations one is, of course, free to vary prefactors and even signs in
(7.77) and (7.78), but our choice matches the convention for $T_{\mu\nu}$ in quantum field theory, with respect to which one
should actually multiply Newton's constant G with the factor $16\pi$ in (7.78) and with $8\pi$ in (7.1).

322 We might as well write these as $\delta(S_G(g) + S_M(g,\phi))/\delta\phi = 0$, since $S_G(g)$ is independent of $\phi$.

7.4 Electromagnetism: gauge invariance and constraints


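We start with electromagnetism, since it allows us to make an important conceptual point with
regard to the Einstein equations. To make this point it is enough to work in Minkowski space-time,
in which $\nabla_\mu = \partial_\mu$, $A^0 = -A_0$, $A^i = A_i$ ($i = 1,2,3$), and the wave operator (d'Alembertian) is

$$\Box = \partial^\mu\partial_\mu = -\partial_t^2 + \Delta. \qquad (7.85)$$

The equation of motion for $A_\mu$, obtained by varying $A_\mu$ in (7.83) with flat metric $g_{\mu\nu} = \eta_{\mu\nu}$, is

$$\frac{\delta S_M(g,A)}{\delta A_\mu} = 0 \;\Leftrightarrow\; \frac{\partial\mathcal{L}}{\partial A_\mu} - \partial_\nu\frac{\partial\mathcal{L}}{\partial(\partial_\nu A_\mu)} = 0. \qquad (7.86)$$

For the specific action (7.83) this immediately yields

$$R_\mu = 0; \qquad R_\mu := \partial^\nu F_{\nu\mu} = \Box A_\mu - \partial_\mu(\partial_\nu A^\nu), \qquad (7.87)$$

which may, more intrinsically,$^{323}$ be written in terms of the Hodge dual as $d\star F = 0$ (the other
half of the Maxwell equations is $dF = 0$, which however is automatic given $F = dA$). In
parallel with the discussion in §7.2, the action (7.83) is gauge invariant, in that we have
$S_M(A + d\lambda) = S_M(A)$, say for all $\lambda \in C_c^\infty(V)$. Gauge invariance under $\delta A_\mu = \partial_\mu\lambda$ yields

$$0 = \int_V d^4x\,\partial_\nu F^{\nu\mu}\,\partial_\mu\lambda = -\int_V d^4x\,\lambda\,\partial_\mu\partial_\nu F^{\nu\mu} \qquad (7.88)$$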


for all $\lambda \in C_c^\infty(V)$, which gives the Bianchi identity for electromagnetism, $\partial_\mu\partial_\nu F^{\nu\mu} = 0$, i.e.

$$\partial_\mu R^\mu = 0. \qquad (7.89)$$

This is so obvious (in view of the antisymmetry of $F$) as to be disappointing, but it must be
stressed that (7.89) is similar to (7.56) in being an identity, which holds irrespective of the
equations of motion. See below for its thrust! Another consequence of gauge invariance is that
the equations of motion (7.87) are simultaneously underdetermined and overdetermined:

• They are underdetermined in that: if $A$ solves (7.87), then so does $A + d\lambda$, $\lambda \in C^\infty(\mathbb{R}^4)$;

• They are overdetermined in that the initial values are constrained (i.e. cannot be arbitrary).


The first point is immediate from (7.87). For the second, we note that for $\mu = 0$ eq. (7.87) reads

$$C = 0; \qquad C := R_0 = \partial^\nu F_{\nu 0} = \partial_i F_{i0} = \Box A_0 - \partial_0(\partial_\nu A^\nu) = \Delta A_0 - \partial_t(\nabla\cdot\vec A). \qquad (7.90)$$

This is not an evolution equation but a constraint on the initial data $A_\mu(\vec x)$ and $\dot A_\mu(\vec x)$ at $x^0 = t = 0$,
$\vec x = (x^1,x^2,x^3)$. The fact that $R_0$ does not contain second-order derivatives in time follows from
the "Bianchi identity" (7.89), for if $\partial_0 R^0$ equals some expression containing at most second-order
derivatives in time, then $R^0 = -R_0$ contains at most first-order derivatives in time. Since (7.89)
follows from the gauge invariance of the action that causes the underdetermination, we see that
under- and overdetermination of the field $A_\mu$ are two sides of the same coin. Defining the electric
field $E_i = F_{i0} = \partial_i A_0 - \partial_0 A_i$ ($i = 1,2,3$), eq. (7.90) is just the Gauss law

$$\nabla\cdot\vec E = 0. \qquad (7.91)$$
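The chain of equalities behind the constraint can be checked term by term. The sketch below assumes the definition $R_\mu := \partial^\nu F_{\nu\mu} = \Box A_\mu - \partial_\mu(\partial_\nu A^\nu)$ for (7.87), together with the Minkowski conventions $A^0 = -A_0$, $A^i = A_i$ and $\Box = -\partial_0^2 + \Delta$ stated above.

```latex
\begin{align*}
R_0 &= \Box A_0 - \partial_0(\partial_\nu A^\nu)
     = \bigl(-\partial_0^2 + \Delta\bigr)A_0 - \partial_0\bigl(-\partial_0 A_0 + \partial_i A_i\bigr) \\
    &= \Delta A_0 - \partial_0\,\partial_i A_i
     % the two second-order time derivatives of A_0 cancel
     \\
    &= \partial_i\bigl(\partial_i A_0 - \partial_0 A_i\bigr) = \partial_i F_{i0} = \partial_i E_i .
\end{align*}
```

The cancellation of the $\partial_0^2 A_0$ terms in the second line is exactly why $R_0$ is a constraint on the initial data rather than an evolution equation.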


323 In coordinates the Hodge dual of $F$ is $\star F_{\mu\nu} = \tfrac12 g^{\rho\alpha}g^{\sigma\beta}\epsilon_{\rho\sigma\mu\nu}F_{\alpha\beta}$, where $\epsilon$ is the Levi-Civita tensor.

To address the underdetermination of $A$, we pick a gauge condition, namely the Lorenz gauge$^{324}$

$$G = 0; \qquad G := \partial_\mu A^\mu. \qquad (7.92)$$

In terms of this gauge condition, we also introduce the notation

$$R^L_\mu := R_\mu + \partial_\mu G = \Box A_\mu. \qquad (7.93)$$

Without imposing any of (7.87), (7.90), or (7.92), the objects $R_\mu$, $R^L_\mu$, $C$, and $G$ are related by

$$\dot G = -C + R^L_0; \qquad (7.94)$$
$$\Box G = \partial^\mu R^L_\mu; \qquad (7.95)$$
$$\dot C = \partial_i R_i = \partial_i(R^L_i - \partial_i G) = \nabla\cdot\vec R^L - \Delta G, \qquad (7.96)$$

where (7.95) and (7.96) follow from the Bianchi identity (7.89): applying $\partial^\mu$ to (7.93) and using
(7.89) gives (7.95), whereas (7.96) is (7.89) itself, combined with (7.90) and (7.93).

The point of all this is that instead of directly solving the awkward (i.e. underdetermined as 
well as overdetermined) Maxwell equations (7.87), one can first solve the wave equation 


$$R^L_\mu = 0 \;\Leftrightarrow\; \Box A_\mu = 0, \qquad (7.97)$$


which is of standard hyperbolic type: its solutions for given initial data are even known explicitly. 
There are two different ways to solve the equations (7.87) via (7.97), which both come down 
to the simple fact that the conjunction of (7.97) and (7.92) implies (7.87). But they differ in the 
distribution of labour between (7.97) and (7.92), as follows: 


• Covariant approach. Here we solve (7.97) for each $\mu = 0,1,2,3$ subject to initial data
$A_\mu(\vec x)$ and $\dot A_\mu(\vec x)$ at $t = 0$ that respect both the constraint and the gauge condition:

$$C(0,\vec x) = \Delta A_0(\vec x) - \partial_i\dot A_i(\vec x) = 0; \qquad (7.98)$$
$$G(0,\vec x) = \partial_i A_i(\vec x) - \dot A_0(\vec x) = 0. \qquad (7.99)$$

To show that this can indeed be done, first take $A_0(\vec x) = \dot A_0(\vec x) = 0$ (which, incidentally,
solves (7.97) by $A_0(x) = 0$), so that (7.98) and (7.99) become $\partial_i\dot A_i = 0$ and $\partial_i A_i = 0$,
respectively. For example, take $\dot A_i(\vec x) = 0$ but $A_i(\vec x) \neq 0$ arbitrary, and solve the elliptic
equation $\Delta\lambda = -\partial_i A_i$ for $\lambda$. Replacing $A_i$ by $A_i + \partial_i\lambda$ then satisfies (7.99). Eqs. (7.97),
(7.98), and (7.94) imply $\dot G(t = 0,\vec x) = 0$. Eqs. (7.95) and (7.97) then imply

$$\Box G = 0. \qquad (7.100)$$

With the initial conditions $G(t = 0,\vec x) = 0$ and $\dot G(t = 0,\vec x) = 0$, this implies $G(x) = 0$ for all $x \in \mathbb{R}^4$ by the
theory of the wave equation. This propagation of the gauge shows that (7.97) and (7.98)
- (7.99) yield (7.87). The analogous propagation of the constraint $C(t) = 0$ is just a
consistency check in this covariant approach: it follows from (7.94), since $G = R^L_0 = 0$, or
from (7.96), which implies $\dot C(t) = 0$ and, given (7.98), yields $C(t) = 0$ at all $t$.


• Non-covariant approach. We solve (7.97) for each $\mu = 1,2,3$ only, as well as (7.98)
at $t = 0$, but we now have to solve (7.92) for all $t$. By (7.96) this still gives $\dot C(t) = 0$
and hence $C(t) = 0$, so that (7.94) yields $R^L_0 = 0$. We then have (7.97) for $\mu = 0,1,2,3$,
and hence, given (7.92), once again have solved (7.87). Note that the propagation of the
constraint is independent of the gauge: if $R_i = 0$ whichever way, $C(t) = 0$ follows from
the first equality in (7.96) with $C(0) = 0$, and hence from the Bianchi identity.
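A plane wave gives a concrete illustration of why the wave equation alone does not suffice and the gauge condition is genuinely needed. The sketch below uses a hypothetical ansatz $A_\mu = a_\mu\cos(k\cdot x)$ with constant $a_\mu$ and $k_\mu \neq 0$ (symbols introduced here for illustration), writes $k\cdot x = k_\nu x^\nu$, and assumes $R_\mu = \Box A_\mu - \partial_\mu G$ with $G = \partial_\mu A^\mu$.

```latex
\begin{align*}
\Box A_\mu &= -(k_\nu k^\nu)\,a_\mu\cos(k\cdot x) = 0
   \quad\Longleftrightarrow\quad k_\nu k^\nu = 0, \\
G &= \partial_\mu A^\mu = -(k_\mu a^\mu)\,\sin(k\cdot x), \\
R_\mu &= \Box A_\mu - \partial_\mu G = (k_\nu a^\nu)\,k_\mu\cos(k\cdot x)
   \quad\text{when } k_\nu k^\nu = 0 .
\end{align*}
```

So a null wave vector solves the wave equation for every polarization $a_\mu$, but the Maxwell equations $R_\mu = 0$ hold only if in addition $k_\mu a^\mu = 0$, which is the Lorenz gauge condition for this ansatz. The remaining freedom $a_\mu \mapsto a_\mu + c\,k_\mu$, generated by $\lambda = c\,\sin(k\cdot x)$, preserves both conditions because $k_\nu k^\nu = 0$.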


324 This gauge should be named after Ludvig Lorenz (1829-1891), rather than H.A. Lorentz (Kragh, 2016).

7.5 General relativity: diffeomorphism invariance and constraints

To start, Einstein's equations (7.1) have two key features analogous to Maxwell's equations:

• They are underdetermined: if $g$ solves (7.1), then so does $\psi^*g$, for any $\psi \in \mathrm{Diff}(M)$.

• They are overdetermined in that the initial values are constrained (i.e. cannot be arbitrary).

As in the simpler case of electrodynamics, both properties have the same origin, namely the
Bianchi identity, here (7.56), i.e. $\nabla^\mu G_{\mu\nu} = 0$. This identity follows from the diffeomorphism
invariance of the action (7.2), and it shows that $G_{0\nu}$ contains at most first-order derivatives in
time (since $\partial_t G_{0\nu}$ equals some expression containing at most second-order derivatives in time).
The first point also follows from the Einstein equations (7.1) themselves, which read

$$G_g = 8\pi T(g,\phi), \qquad (7.101)$$

where $G$ is the Einstein tensor (4.111). From (2.84) with $\psi \rightsquigarrow \psi^{-1}$, (3.54), (4.10) and (4.34)
we obtain $R_{\psi^*g} = \psi^*R_g$ (where we explicitly denote the dependence of the Riemann tensor $R$
on the metric $g$), and similarly for the Ricci tensor, the Ricci scalar, and the Einstein tensor, i.e.
$G_{\psi^*g} = \psi^*G_g$. Similarly, the energy-momentum tensor $T(g,\phi)$ should be constructed such that

$$T(\psi^*g,\psi^*\phi) = \psi^*T(g,\phi), \qquad (7.102)$$

and hence Einstein's equation (7.101) for $g$ implies

$$G_{\psi^*g} - 8\pi T(\psi^*g,\psi^*\phi) = \psi^*\bigl(G_g - 8\pi T(g,\phi)\bigr) = \psi^*0 = 0. \qquad (7.103)$$


In what follows we just discuss the vacuum case ($T = 0$), since the general case is similar.$^{325}$
From (4.14), (4.15), and (4.109) we easily obtain, in any coordinate system,

$$R_{\mu\nu} = -\tfrac12 g^{\rho\sigma}g_{\mu\nu,\rho\sigma} - \tfrac12 g^{\rho\sigma}(g_{\rho\sigma,\mu\nu} - g_{\sigma\nu,\mu\rho} - g_{\mu\rho,\sigma\nu}) + F(g,\partial g), \qquad (7.104)$$

where $F(g,\partial g)$ contains only first derivatives of the metric.$^{326}$ Anticipating a detailed discussion,
we now point out that the ten (vacuum) Einstein equations Guy = 0 come in two groups: 


• The six dynamical equations $G_{ij} = 0$, where $i,j = 1,2,3$ as usual, in which second-order
time derivatives of the components of the metric occur;

• The four constraints $C_\mu := G_{\mu 0} = 0$, $\mu = 0,1,2,3$, in which only first-order time derivatives
of $g_{\mu\nu}$ occur, so that these give relations between initial values for $G_{ij} = 0$.


As in (7.87), the first term in (7.104), which is essentially $\Box g_{\mu\nu}$, has a good PDE theory (as we
will see, it makes the spatial components of $g$ satisfy a hyperbolic evolution equation), but the
other three terms, which are analogous to the second term in (7.87), ruin this and hence should
be removed by a clever choice of coordinates that makes them disappear. The simplest way to do
this (introduced by Choquet-Bruhat) is to use the wave gauge,$^{327}$ which given a metric $g_{\mu\nu}$ is

$$W^\mu = 0; \qquad W^\mu := \Box_g x^\mu, \qquad (7.105)$$

where the covariant D'Alembertian $\Box_g$ is defined, on any tensor, by

$$\Box_g = g^{\rho\sigma}\nabla_\rho\nabla_\sigma. \qquad (7.106)$$


325 The discussion revolves around second derivatives of $g_{\mu\nu}$ in the Einstein equation, which are absent in $T_{\mu\nu}$.
326 We will later see that in the relevant PDE theory only the highest derivatives of the unknown functions count.
327 Coordinates satisfying (7.105) are called harmonic or wave coordinates. See Choquet-Bruhat (2009), §VI.7.

In the wave gauge as defined by condition (7.105), the coordinate functions $x^\mu$ are scalars,$^{328}$
which, once again given a metric $g_{\mu\nu}(y)$, are found (locally) as functions $x^\mu(y)$ of some given
coordinates $(y^\mu)$ by solving $\Box_g x^\mu = 0$ subject to initial conditions on a spacelike hypersurface
$\Sigma$: if $(\tilde x^i)$ are given coordinates on $\Sigma$, and $N$ is a fd normal vector field on $\Sigma$, we might impose
conditions like $x^i|_\Sigma = \tilde x^i$, $x^0|_\Sigma = 0$, $Nx^i = 0$, and $Nx^0 = 1$. In the new $x^\mu$-coordinates, we have

$$W^\mu = g^{\rho\sigma}\nabla_\rho\partial_\sigma x^\mu = g^{\rho\sigma}(\partial_\rho\partial_\sigma - \Gamma^\nu_{\rho\sigma}\partial_\nu)x^\mu = g^{\rho\sigma}(\partial_\rho\delta^\mu_\sigma - \Gamma^\nu_{\rho\sigma}\delta^\mu_\nu) = -g^{\rho\sigma}\Gamma^\mu_{\rho\sigma}, \qquad (7.107)$$

where $g_{\mu\nu}(x)$, and hence $g^{\rho\sigma}(x)$ and $\Gamma^\mu_{\rho\sigma}(x)$, are obtained from $g_{\mu\nu}(y)$ using the traditional
change of coordinates formula (2.77). Hence (7.105) is a second-order PDE for the $x^\mu$.
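A simple example shows that the wave gauge is a real restriction on coordinates. The sketch below takes Minkowski space-time in spherical coordinates $(t,r,\theta,\varphi)$ (a choice made here for illustration), with the standard nonzero Christoffel symbols quoted in the comments, and evaluates $W^\mu = -g^{\rho\sigma}\Gamma^\mu_{\rho\sigma}$.

```latex
\begin{align*}
% metric: g = -dt^2 + dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\, d\varphi^2
% relevant symbols: \Gamma^r_{\theta\theta} = -r, \ \Gamma^r_{\varphi\varphi} = -r\sin^2\theta, \
%                   \Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta
W^r &= -g^{\theta\theta}\Gamma^r_{\theta\theta} - g^{\varphi\varphi}\Gamma^r_{\varphi\varphi}
     = \frac{1}{r} + \frac{1}{r} = \frac{2}{r}, \\
W^\theta &= -g^{\varphi\varphi}\Gamma^\theta_{\varphi\varphi} = \frac{\cot\theta}{r^2}, \qquad
W^t = W^\varphi = 0 .
\end{align*}
```

These agree with the scalar wave operator applied to the coordinate functions, for instance $\Box r = r^{-2}\partial_r(r^2\,\partial_r r) = 2/r$. Spherical coordinates therefore do not satisfy the wave gauge, whereas Cartesian coordinates on Minkowski space-time, for which all Christoffel symbols vanish, do.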

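Given coordinates $(x^\mu)$, on the other hand, the wave gauge (7.105) is seen as a condition on
the metric $g_{\mu\nu}(x)$, which because of (7.107) must satisfy any of the equivalent conditions

$$g^{\rho\sigma}\Gamma^\mu_{\rho\sigma} = 0 \;\Leftrightarrow\; g^{\rho\sigma}(2g_{\rho\mu,\sigma} - g_{\rho\sigma,\mu}) = 0 \;\Leftrightarrow\; \partial_\nu\bigl(\sqrt{-\det(g)}\,g^{\mu\nu}\bigr) = 0, \qquad (7.108)$$

cf. (7.16), e.g. with corresponding initial conditions $g_{00}|_\Sigma = -1$ and $g_{0j}|_\Sigma = 0$. Using (4.15) gives

$$g_{\mu\rho}\partial_\nu W^\rho + g_{\nu\rho}\partial_\mu W^\rho = g^{\rho\sigma}(g_{\rho\sigma,\mu\nu} - g_{\sigma\nu,\mu\rho} - g_{\mu\rho,\sigma\nu}) + H(g,\partial g), \qquad (7.109)$$

where $H(g,\partial g)$ has a similar meaning as $F(g,\partial g)$. Therefore, the wave-gauged Ricci tensor

$$R^W_{\mu\nu} := R_{\mu\nu} + \tfrac12(g_{\mu\rho}\partial_\nu W^\rho + g_{\nu\rho}\partial_\mu W^\rho), \qquad (7.110)$$

cf. (7.97), takes a desirable quasi-linear hyperbolic form, starting with the D'Alembertian:

$$R^W_{\mu\nu} = -\tfrac12 g^{\rho\sigma}g_{\mu\nu,\rho\sigma} + I(g,\partial g), \qquad (7.111)$$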


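where again $I$ contains only the metric and its first derivatives (though not necessarily linearly).
From (7.110) we also define the reduced Einstein tensor

$$G^W_{\mu\nu} := R^W_{\mu\nu} - \tfrac12 g_{\mu\nu}R^W = G_{\mu\nu} + \tfrac12(g_{\mu\rho}\partial_\nu W^\rho + g_{\nu\rho}\partial_\mu W^\rho - g_{\mu\nu}\partial_\rho W^\rho). \qquad (7.112)$$

We then have the following six enlightening analogies between GR and electromagnetism:

$$g_{\mu\nu} \leftrightarrow A_\mu; \qquad W^\mu \leftrightarrow G; \qquad C_\mu \leftrightarrow C; \qquad (7.113)$$
$$R_{\mu\nu} = 0 \leftrightarrow R_\mu = 0; \qquad R^W_{\mu\nu} = 0 \leftrightarrow R^L_\mu = 0; \qquad \nabla^\mu G_{\mu\nu} = 0 \leftrightarrow \partial_\mu R^\mu = 0. \qquad (7.114)$$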


Similarly to electromagnetism, there is no good theory for the (vacuum) Einstein equations
$R_{\mu\nu} = 0$ we want to solve, whereas there is ample theory for the gauged Einstein equations
$R^W_{\mu\nu} = 0$ (though not as explicit and simple as for the wave equation $\Box A_\mu = 0$). In order to
follow a similar strategy, we should also find analogues of (7.94) - (7.96). First, we have

$$G^W_{\mu\nu} = R^W_{\mu\nu} - \tfrac12 g_{\mu\nu}R^W = G_{\mu\nu} + \tfrac12(g_{\mu\rho}\partial_\nu W^\rho + g_{\nu\rho}\partial_\mu W^\rho - g_{\mu\nu}\partial_\rho W^\rho), \qquad (7.115)$$

so that, taking $\nu = 0$, eq. (7.94) is replaced by four equations ($\mu = 0,1,2,3$)

$$\tfrac12(g_{\mu i}\dot W^i + g_{00}\partial_\mu W^0 + g_{0i}\partial_\mu W^i - g_{\mu 0}\partial_i W^i) = -C_\mu + G^W_{\mu 0}. \qquad (7.116)$$
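The step from (7.115) to (7.116) is a matter of splitting sums into time and space parts. The sketch below assumes $C_\mu = G_{\mu 0}$ and writes a dot for $\partial_0$; it takes $\nu = 0$ in the difference $G^W_{\mu\nu} - G_{\mu\nu}$.

```latex
\begin{align*}
G^W_{\mu 0} - C_\mu
 &= \tfrac12\bigl(g_{\mu\rho}\,\partial_0 W^\rho + g_{0\rho}\,\partial_\mu W^\rho
      - g_{\mu 0}\,\partial_\rho W^\rho\bigr) \\
 &= \tfrac12\bigl(g_{\mu 0}\dot W^0 + g_{\mu i}\dot W^i + g_{00}\,\partial_\mu W^0
      + g_{0i}\,\partial_\mu W^i - g_{\mu 0}\dot W^0 - g_{\mu 0}\,\partial_i W^i\bigr) \\
 &= \tfrac12\bigl(g_{\mu i}\dot W^i + g_{00}\,\partial_\mu W^0
      + g_{0i}\,\partial_\mu W^i - g_{\mu 0}\,\partial_i W^i\bigr).
\end{align*}
```

As a usage note: if $W^\mu = 0$ on the initial surface, all spatial derivatives $\partial_i W^\mu$ vanish there, so with $C_\mu = 0$ and $G^W_{\mu 0} = 0$ the components $\mu = j$ reduce to $g_{ji}\dot W^i = 0$ and the component $\mu = 0$ to $g_{00}\dot W^0 + 2g_{0i}\dot W^i = 0$. Provided $(g_{ij})$ is invertible and $g_{00} \neq 0$, this forces $\dot W^\mu = 0$ initially.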


328 As opposed to components of a 4-vector. Choquet-Bruhat even writes $x^{(\mu)}$ as a warning.

The analogue of (7.95) follows by applying the Bianchi identity (7.56) to (7.115). Using the fact
that the $W^\rho$ are scalars and the Levi-Civita connection $\nabla$ is metric and torsion-free, we compute

$$\nabla^\mu(g_{\mu\rho}\partial_\nu W^\rho + g_{\nu\rho}\partial_\mu W^\rho - g_{\mu\nu}\partial_\rho W^\rho) = g_{\nu\rho}\Box_g W^\rho.$$

Hence eqs. (7.56) and (7.115) give

$$\Box_g W^\mu = 2g^{\mu\nu}\nabla^\rho G^W_{\rho\nu}. \qquad (7.117)$$


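Finally, the counterpart of (7.96) in GR again follows from the Bianchi identity (7.56), viz.

$$\partial^0 C_0 = -\partial^j C_j + (g^{\rho\sigma}\Gamma^\mu_{\rho\sigma} + g^{\rho\mu}\Gamma^0_{\rho 0})C_\mu + g^{\rho 0}\Gamma^j_{\rho 0}C_j + g^{\rho j}\Gamma^k_{\rho 0}G_{jk}; \qquad (7.118)$$
$$\partial^0 C_i = g^{\rho\sigma}(\Gamma^0_{\rho\sigma}C_i + \Gamma^0_{\rho i}C_\sigma) + g^{\rho 0}\Gamma^j_{\rho i}C_j - \partial^j G_{ij} + g^{\rho\sigma}\Gamma^j_{\rho\sigma}G_{ij} + g^{\rho j}\Gamma^k_{\rho i}G_{jk}, \qquad (7.119)$$

where we may also write $G_{ij}$ in terms of $G^W_{ij}$ and $W^\mu$ via (7.115), e.g. for (7.118) this gives

$$(\partial^\mu + W^\mu)C_\mu = g^{\sigma\nu}\Gamma^\mu_{\sigma 0}\bigl(G^W_{\mu\nu} - \tfrac12(g_{\mu\rho}\partial_\nu W^\rho + g_{\nu\rho}\partial_\mu W^\rho - g_{\mu\nu}\partial_\rho W^\rho)\bigr). \qquad (7.120)$$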
Knowing all this, we can solve the vacuum Einstein equations $G_{\mu\nu} = 0$ in two alternative ways:

• Covariant approach. We solve the covariant (space-time) reduced Einstein equations

$$R^W_{\mu\nu} = 0 \qquad (7.121)$$

for all values $\mu,\nu = 0,1,2,3$. This can indeed be done, because (7.121) with (7.110)
is a hyperbolic quasi-linear PDE system for which good existence, uniqueness, and stability
results exist; see §7.6. Since (7.121) gives $R^W = 0$, it also implies $G^W_{\mu\nu} = 0$.
Furthermore, we impose both the constraints and the gauge conditions at $t = 0$, i.e.,

$$C_\mu(t = 0,\vec x) = 0; \qquad (7.122)$$
$$W^\mu(t = 0,\vec x) = 0. \qquad (7.123)$$

Then (7.116) also gives

$$\dot W^\mu(t = 0) = 0, \qquad (7.124)$$

upon which (7.117) gives $W^\mu(t) = 0$ at all $t$, since this is the unique solution with
initial data (7.123) and (7.124). This is the propagation of the gauge. The full Einstein
equations $R_{\mu\nu} = 0$ are then satisfied because of (7.110), and finally (though this is unnecessary
in this approach) propagation of the constraints may be verified from (7.118) - (7.119).


• Non-covariant approach. We solve only the spatial part of the reduced Einstein equations

$$G^W_{ij} = 0 \quad (i,j = 1,2,3), \qquad (7.125)$$

whilst imposing the initial value constraints (7.122) at $t = 0$, and the full gauge condition
$W^\mu(x) = 0$ for all $x = (t,\vec x)$. Since this gives $G^W_{\mu\nu} = G_{\mu\nu}$, the Bianchi identities (7.118) -
(7.119) with $G_{ij} = 0$ simply become

$$\partial^0 C_0 = (-\partial^j + g^{\rho 0}\Gamma^j_{\rho 0})C_j + (g^{\rho\sigma}\Gamma^\mu_{\rho\sigma} + g^{\rho\mu}\Gamma^0_{\rho 0})C_\mu; \qquad (7.126)$$
$$\partial^0 C_i = g^{\rho\sigma}(\Gamma^0_{\rho\sigma}C_i + \Gamma^0_{\rho i}C_\sigma) + g^{\rho 0}\Gamma^j_{\rho i}C_j. \qquad (7.127)$$

Thus the constraints satisfy coupled homogeneous first-order hyperbolic PDEs, whose
solution with given initial condition (7.122) is zero. This propagation of the constraints
via the Bianchi identity gives the remaining Einstein equations $G_{\mu 0} = 0$, and since we
already had $G_{ij} = 0$ for $i,j = 1,2,3$ from $G^W_{ij} = 0$ and $W^\mu = 0$, we seem ready!

There is a complication, however, in that the reduced Einstein equations (7.125) are neither
a priori in suitable (i.e. hyperbolic) form, nor do they follow from $R^W_{ij} = 0$ (which are in
suitable form), because the Einstein tensor $G_{\mu\nu}$ also involves the Ricci scalar $R$, which cannot be
computed from $R^W_{ij}$ alone. This can be resolved by passing to the (inverse) densitized metric

$$\mathfrak{g}^{\mu\nu} = g^{\mu\nu}\sqrt{|\det(g)|}, \qquad (7.128)$$

in terms of which the gauge condition and the gauged Einstein equations read

$$\partial_\nu\mathfrak{g}^{\mu\nu} = 0; \qquad (7.129)$$
$$G_W^{ij} = 0, \qquad (7.130)$$

and moreover the gauged Einstein tensor (7.115) turns out to take the desired hyperbolic form

$$|\det(g)|\,G_W^{\mu\nu} = \tfrac12\mathfrak{g}^{\rho\sigma}\partial_\rho\partial_\sigma\mathfrak{g}^{\mu\nu} + Q(\mathfrak{g},\partial\mathfrak{g}). \qquad (7.131)$$

We will not follow this path, but will set up the non-covariant approach in a more geometric way
in §7.7. This will still follow the general idea of solving $R^W_{ij} = 0$ or $G^W_{ij} = 0$ with initial value
constraints (7.122), and a full (but non-covariant) gauge condition like $W^\mu(x) = 0$.

Finally, since we have already treated electromagnetism as a warm-up for GR, it is interesting
to combine the two in the light of gauge fixing and constraints. Hence we briefly study the
coupled Einstein-Maxwell equations (7.62) with (7.84) and (7.79). Since $T = 0$, these become

$$R_{\mu\nu} = 2g^{\rho\sigma}F_{\mu\rho}F_{\nu\sigma} - \tfrac12 g_{\mu\nu}F^2; \qquad (7.132)$$
$$R_\mu := \nabla^\nu F_{\nu\mu} = 0. \qquad (7.133)$$

Everything in §7.4 goes through, provided we replace ordinary derivatives by covariant ones, as
in (7.133). For example, in the derivation of the Bianchi identity eq. (7.88) becomes

$$0 = \int_V d^4x\,\sqrt{-g}\,\nabla_\nu F^{\nu\mu}\,\partial_\mu\lambda = -\int_V d^4x\,\sqrt{-g}\,\lambda\,\nabla_\mu\nabla_\nu F^{\nu\mu}, \qquad (7.134)$$

where $\nabla_\mu$ instead of $\partial_\mu$ arises because of (7.17). Hence the Bianchi identity (7.89) becomes

$$\nabla_\mu R^\mu = 0, \qquad (7.135)$$

which unlike (7.89) is far from trivial. A simple computation using (4.13) and (4.109) yields

$$R_\mu = \Box_g A_\mu - R_{\mu\nu}A^\nu - \partial_\mu G, \qquad (7.136)$$

where, in the spirit of what was just said, the covariant Lorenz gauge is now given by

$$G = \nabla_\mu A^\mu. \qquad (7.137)$$

Putting $R^L_\mu = R_\mu + \partial_\mu G$ as before, we have $R^L_\mu = \Box_g A_\mu - R_{\mu\nu}A^\nu$. In order to solve (7.132) -
(7.133), then, we must solve the gauge conditions $G = 0$ and $W^\mu = 0$ and the hyperbolic system

$$R^W_{\mu\nu} = 2g^{\rho\sigma}F_{\mu\rho}F_{\nu\sigma} - \tfrac12 g_{\mu\nu}F^2; \qquad (7.138)$$
$$R^L_\mu = 0. \qquad (7.139)$$

Spelling out the "covariant" and "non-covariant" approaches is now just a tedious exercise.
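The curvature term in $R^L_\mu = \Box_g A_\mu - R_{\mu\nu}A^\nu$ comes from commuting two covariant derivatives. The sketch below assumes the sign conventions $[\nabla_\mu,\nabla_\nu]V^\rho = R^\rho{}_{\sigma\mu\nu}V^\sigma$ and $R_{\sigma\nu} = R^\rho{}_{\sigma\rho\nu}$, which are assumptions about (4.13) and (4.109) since those equations are not reproduced here.

```latex
\begin{align*}
R_\mu = \nabla^\nu F_{\nu\mu}
  &= \nabla^\nu\nabla_\nu A_\mu - \nabla_\nu\nabla_\mu A^\nu \\
  &= \Box_g A_\mu - \nabla_\mu\nabla_\nu A^\nu - [\nabla_\nu,\nabla_\mu]A^\nu \\
  &= \Box_g A_\mu - \partial_\mu G - R^\nu{}_{\sigma\nu\mu}A^\sigma
   = \Box_g A_\mu - \partial_\mu G - R_{\mu\nu}A^\nu .
\end{align*}
```

Adding $\partial_\mu G$ to both sides gives the stated expression for $R^L_\mu$. In flat space the Ricci term drops out and the Minkowski formula $R^L_\mu = \Box A_\mu$ of §7.4 is recovered.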

7.6 Existence, uniqueness, and maximality of solutions

In this section we give a geometric formulation of the Cauchy problem for the Einstein equations,
including its (abstract) solution in the form of Theorem 7.10, obtained in 1969 by Choquet-Bruhat
and Geroch (following two decades of progress mainly due to Choquet-Bruhat).$^{329}$

So far, the procedure in §7.5 leads to solutions that are local in space and local in time:

• Locality in space follows from the use of specific coordinates, i.e. those satisfying (7.105).

• Locality in time is all that the existence (and uniqueness and stability) theorems for
quasi-linear second-order hyperbolic PDEs of the kind (7.121) provide.


We now indicate how this can be improved. First, local existence in space turns into global 
existence in space by globalizing the gauge, as follows. A well-known concept in Riemannian 
geometry is that of a harmonic map $h : M \to \hat{M}$ between Riemannian manifolds $(M,g)$ and 
$(\hat{M},\hat{g})$. These maps can be described abstractly, but it is easier to use local coordinates $(x^\mu)$ on 
$M$, and likewise $(\hat{x}^i)$ on $\hat{M}$. Any map $h : M \to \hat{M}$ has an associated energy functional, defined by 


$$E(h) = \int_M d^m x\, \sqrt{g}\; e(h); \qquad e(h) = \tfrac{1}{2}\, g^{\mu\nu}\, \hat{g}_{ij}\, \partial_\mu h^i\, \partial_\nu h^j, \qquad (7.140)$$


where $m = \dim(M)$ and $h^i$ are the components of $h$ relative to the coordinates $(\hat{x}^i)$. This expression turns out to be 
independent of the coordinates.³³⁰ For example, if $M = [a,b]$ with flat metric, then $E(h)$ is the 
energy (3.23) of a curve in $\hat{M}$. Another example is $\hat{M} = \mathbb{R}$ with flat metric, in which case 


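$$E(h) = \tfrac{1}{2} \int_M d^m x\, \sqrt{g}\; g^{\mu\nu}\, \partial_\mu h\, \partial_\nu h \qquad (7.141)$$


is the Dirichlet integral of $h$, which plays a fundamental role in the theory of the Laplace 
equation $\Delta h = 0$ on $M$. It can be shown that $h$ extremizes $E(h)$ iff it solves the equation 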


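$$g^{\mu\nu} \left( \frac{\partial^2 h^i}{\partial x^\mu \partial x^\nu} - \Gamma^\rho_{\mu\nu} \frac{\partial h^i}{\partial x^\rho} + \hat{\Gamma}^i_{jk} \frac{\partial h^j}{\partial x^\mu} \frac{\partial h^k}{\partial x^\nu} \right) = 0, \qquad (7.142)$$


where $\Gamma^\rho_{\mu\nu}$ and $\hat{\Gamma}^i_{jk}$ are the Christoffel symbols for $g$ and $\hat{g}$, respectively. Thus $h$ is called 
harmonic if it solves (7.142). Exactly the same constructions work in Lorentzian geometry, in 
which case a solution of (7.142) is called a wave map. In that case, standard hyperbolic PDE 
theory yields existence and uniqueness of solutions $h$ subject to initial conditions (for $h$ and its first time derivative) on a 
Cauchy surface $\Sigma$ in $M$, which we (evidently) assume to be globally hyperbolic. 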
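
As a sanity check on the harmonic map equation (7.142), one may look at the one-dimensional case $M = [a,b]$ mentioned above. The following sketch assumes the standard form of the equation, with $\Gamma$ and $\hat{\Gamma}$ the Christoffel symbols of the source and target metrics; the name $t$ for the parameter on $[a,b]$ is our own label.

```latex
% Assumption: M = [a,b] with flat metric g = dt^2, so that g^{tt} = 1 and \Gamma = 0.
% A map h : [a,b] \to \hat{M} is then a curve t \mapsto h^i(t) in the target.
\begin{align*}
g^{\mu\nu}\left(\partial_\mu\partial_\nu h^i
   - \Gamma^\rho_{\mu\nu}\,\partial_\rho h^i
   + \hat{\Gamma}^i_{jk}\,\partial_\mu h^j\,\partial_\nu h^k\right) = 0
\quad\Longrightarrow\quad
\ddot{h}^i + \hat{\Gamma}^i_{jk}(h(t))\,\dot{h}^j\,\dot{h}^k = 0 .
\end{align*}
```

In this special case the harmonic maps are therefore the affinely parametrized geodesics of the target, in line with the fact that geodesics extremize the energy of a curve.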

In order to provide the right version of the wave gauge enabling global solutions in space, we 
pick some fiducial Riemannian metric $\gamma$ on our $\Sigma$ and introduce the Lorentzian manifold 


$$\hat{M} := \mathbb{R} \times \Sigma; \qquad \hat{g} := -dt^2 + \gamma. \qquad (7.143)$$


Definition 7.3 We say that a (Lorentzian) metric $g$ on $M = \mathbb{R} \times \Sigma$ satisfies the $\hat{g}$-wave gauge iff 
the identity map $\mathrm{id} : M \to \hat{M}$ is a wave map with respect to $g$ and $\hat{g}$. 


329 Introductions to the Cauchy problem in GR, from different perspectives, include Choquet-Bruhat & York 
(1980), Friedrich & Rendall (2000), Klainerman & Nicolò (2003), Rendall (2005, 2008), Christodoulou (2008), 
Dafermos (2009), Choquet-Bruhat (2009), Ringström (2009), Chrusciel (2010), and Aretakis & Rodnianski (2015). 

330See e.g. Jost (2002), §8.1. 
It follows from the coordinate-independence of (7.142) that this condition is coordinate-independent. 
One can also see this explicitly by noting that $g$ satisfies the $\hat{g}$-wave gauge iff 


$$\hat{W}^\mu = 0 \qquad (7.144)$$

for each $\mu = 0,1,2,3$, where, cf. (7.105) and (7.107), 

$$\hat{W}^\mu = g^{\rho\nu} \left( \Gamma^\mu_{\rho\nu} - \hat{\Gamma}^\mu_{\rho\nu} \right). \qquad (7.145)$$
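
The equivalence between the wave-map condition on the identity and (7.144) - (7.145) can be checked in one line. The sketch below assumes the standard form of the wave map equation (7.142) and uses the same local coordinates on $M$ and on the fiducial manifold, so that the identity map has components $h^\mu(x) = x^\mu$; hats mark the fiducial Christoffel symbols.

```latex
% Sketch: insert h = id, i.e. h^\mu(x) = x^\mu, into the wave map equation.
\begin{align*}
\partial_\nu h^\mu &= \delta^\mu_\nu, \qquad \partial_\rho\partial_\nu h^\mu = 0,\\
0 &= g^{\rho\nu}\left(0 - \Gamma^\sigma_{\rho\nu}\,\delta^\mu_\sigma
      + \hat{\Gamma}^\mu_{\alpha\beta}\,\delta^\alpha_\rho\,\delta^\beta_\nu\right)
   = -\,g^{\rho\nu}\left(\Gamma^\mu_{\rho\nu} - \hat{\Gamma}^\mu_{\rho\nu}\right).
\end{align*}
```

The right-hand side is a contraction of the difference of two connections, which is why the resulting gauge condition is tensorial.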


Since the difference between two connections is a tensor (see §7.2), the index $\mu$ is now a true 
vector index in that $\hat{W}^\mu$ are the components of a vector. Thus the coordinate-dependence of the 
original wave gauge has been traded for $\hat{g}$-dependence. We now follow the same steps as for 
the wave gauge, replacing $W$ by $\hat{W}$ from (7.109) till the end of §7.5, with the same conclusions: 
the reduced Einstein equations are quasi-linear and hyperbolic, the gauge and the constraints 
propagate, etc., with the difference that none of the arguments now depend on the choice of 
local coordinates and hence local (coordinate) solutions can be patched together so as to become 
globally defined in space, that is, on $\Sigma$. In particular, we may treat the $\mu$ in (7.145) as a vector 
index and write down neat covariant formulae. This gives, for example, 


Rav = Ruv + 1(VuW + VyWy) = —48° guvp0 + 1(8. 98); (7.146) 
n N w 
Wu + RYW, = V” Gay (7.147) 


cf. (7.110) - (7.112) and (7.117), which still have a desirable hyperbolic form. Mutatis mutandis, 
both the covariant and the non-covariant approaches of the previous section may then proceed. 

In order to (at least partially) overcome the problems with locality in time mentioned above 
we explain a specific way of posing the initial data that, within PDE theory, seems unique to 
GR. This construction not only brings the initial data in geometric form (as opposed to giving 
$(g_{\mu\nu}(t = 0), \dot{g}_{\mu\nu}(t = 0))$, as might be expected for hyperbolic PDEs) but also solves the closely 
related problem that naively a solution $(M, g)$ to the Einstein equations would be based on initial 
data given on some Cauchy surface $\Sigma \subset M$ where $M$ is given; but the problem in GR is that the 
manifold $M$ is typically constructed along with the metric $g$, as opposed to being given in advance. 

To find the correct geometric way of posing the Cauchy problem for GR, we first assume we 
have a globally hyperbolic space-time $(M,g)$ solving the Einstein equations (in vacuum or with 
matter), assume we have a spacelike Cauchy surface $\Sigma \subset M$, seen as a triple $(M,g,\iota)$, where 
$\iota : \Sigma \to M$ injects some given 3-manifold $\Sigma$ into $M$ (as an embedded submanifold, cf. Definition 
4.13), and figure out which initial data the triple $(M,g,\iota)$ puts on $\Sigma$. These initial data will then 
be taken by themselves, after which the ambient space-time $(M,g)$ can be forgotten. 

As already mentioned, instead of $g_{\mu\nu}$ and $\dot{g}_{\mu\nu}$ at $\Sigma$, one prefers geometric data, namely: 


• The induced Riemannian 3-metric $\tilde{g} := \iota^* g$, cf. (4.124); 
• The extrinsic curvature $k$ of the embedding $\iota : \Sigma \to M$, see (4.144). 


In the next section (§7.7) we will show that the Einstein equations impose constraints on these 
quantities (see also §7.5 for motivation and context), which in the vacuum case are 


$$\tilde{R} - \mathrm{Tr}(k^2) + \mathrm{Tr}(k)^2 = 0; \qquad (7.148)$$
$$\tilde{\nabla}_j k^j{}_i - \tilde{\nabla}_i \mathrm{Tr}(k) = 0. \qquad (7.149)$$

Here $\tilde{R}$ is the Ricci scalar on $\Sigma$ for the Riemannian metric $\tilde{g}$ and likewise $\tilde{\nabla}$ is the Levi-Civita 
connection on $\Sigma$ determined by $\tilde{g}$. Thus the initial data for the Einstein equations are triples 


$$(\Sigma, \tilde{g}_{ij}, k_{ij}) \equiv (\Sigma, \tilde{g}, k), \qquad (7.150)$$

subject to the vacuum constraints (7.148) - (7.149), or their matter analogues (8.65) - (8.67). 
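
To get a feeling for the vacuum constraints, it may help to verify them on explicit data. Besides the trivial flat data ($\Sigma = \mathbb{R}^3$, Euclidean metric, $k = 0$), the sketch below uses a hyperboloid of radius $a$ in Minkowski space-time. The example is our own choice and relies on two standard facts that are assumed here rather than derived: the induced 3-metric is hyperbolic with Ricci scalar $-6/a^2$, and the extrinsic curvature is proportional to the induced metric, with an overall sign that depends on conventions and drops out of the check.

```latex
% Assumed data (illustration only): \Sigma = hyperboloid of radius a in Minkowski space-time,
% \tilde{g} = hyperbolic metric of curvature -1/a^2, k_{ij} = \pm \tilde{g}_{ij}/a.
\begin{align*}
\tilde{R} &= -\frac{6}{a^2}, \qquad
\mathrm{Tr}(k) = \pm\frac{3}{a}, \qquad
\mathrm{Tr}(k^2) = k_{ij}k^{ij} = \frac{3}{a^2},\\
\tilde{R} - \mathrm{Tr}(k^2) + \mathrm{Tr}(k)^2
   &= -\frac{6}{a^2} - \frac{3}{a^2} + \frac{9}{a^2} = 0,\\
\tilde{\nabla}_j k^j{}_i - \tilde{\nabla}_i \mathrm{Tr}(k)
   &= \pm\frac{1}{a}\,\tilde{\nabla}_j \delta^j_i - \partial_i\!\left(\pm\frac{3}{a}\right) = 0 .
\end{align*}
```

Both constraints hold, as they must for data induced by a vacuum solution.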


Definition 7.4 Given initial data $(\Sigma,\tilde{g},k)$ satisfying the constraints (7.148) - (7.149), any triple 
$(M, g,\iota)$ that solves the Einstein equations and induces these initial data in such a way that $\iota(\Sigma)$ 
is a Cauchy surface in $M$, so that in particular $(M, g)$ is globally hyperbolic, is called a Cauchy 
development or globally hyperbolic development of the data $(\Sigma,\tilde{g},k)$. 


The theorems below may then be summarized as follows: 


Theorem 7.5 Let $(\Sigma,\tilde{g})$ be a 3d Riemannian manifold equipped with a second symmetric tensor 
$k \in \mathfrak{X}^{(2,0)}(\Sigma)$ 
such that $(\Sigma,\tilde{g},k)$ satisfies the constraints (7.148) - (7.149). Then there exists a maximal globally 
hyperbolic space-time $(M,g)$ and an isometric embedding $\iota : \Sigma \to M$ for which the extrinsic 
curvature is the given $k$, and such a space-time is unique up to isometry. 


We will explain what ‘maximal’ means here. It is interesting to compare this with Theorem 4.18, 
i.e. the fundamental theorem for hypersurfaces, which for this purpose we rephrase as follows: 


Theorem 7.6 Let $(\Sigma,\tilde{g})$ be a connected and simply connected Riemannian manifold equipped with 
a second symmetric tensor $k \in \mathfrak{X}^{(2,0)}(\Sigma)$ such that $(\Sigma,\tilde{g},k)$ satisfies the Gauss-Codazzi equations 

$$\tilde{R}_{ijkl} = k_{ik} k_{jl} - k_{il} k_{jk}; \qquad (7.151)$$
$$\tilde{\nabla}_i k_{jk} - \tilde{\nabla}_j k_{ik} = 0. \qquad (7.152)$$

If $m = \dim(\Sigma) > 2$, there exists an isometric embedding $\iota : \Sigma \to \mathbb{R}^{m+1}$ for which the extrinsic 
curvature is the given tensor $k$, and such an embedding is unique up to Euclidean motions (i.e. 
up to isometries, which are combinations of translations and rotations). 


The constraints (7.151) - (7.152) are stronger than (7.148) - (7.149); up to a relative sign, which 
accounts for the difference between the Riemannian and the Lorentzian cases, see (4.148), eq. 
(7.148) follows from (7.151) by contracting it with $\tilde{g}^{ik}\tilde{g}^{jl}$, whilst (7.149) follows from (4.155) 
by contracting with $\tilde{g}^{jk}$. The reason is that Theorem 4.18 asks for a stronger result, namely 
embedding into Euclidean space, where $R_{\rho\sigma\mu\nu} = 0$, whereas Theorem 7.5 merely asks for 
embedding in a Lorentzian manifold where $R_{\mu\nu} = 0$. Otherwise, the spirit of the two theorems 
is similar, in that the Gauss-Codazzi equations and the constraints in GR, whose geometric 
form (7.148) - (7.149) will actually be derived from the Gauss-Codazzi equations, both arise as 
consistency conditions for the existence of a certain embedding of the initial data set $(\Sigma, \tilde{g},k)$: 


• into Euclidean space in the nineteenth-century fundamental theorem for hypersurfaces; 


• into a space-time solving the Einstein equations in the twentieth-century Theorem 7.5. 
We will now dissect Theorem 7.5, in particular making precise (in steps) what it means for the 
space-time $(M,g)$ to be maximal (this will be done in the crowning Theorem 7.10).³³¹ 


Theorem 7.7 For any smooth initial data set $(\Sigma, \tilde{g},k)$ satisfying the constraints (7.148) - (7.149) 
there is an open interval $0 \in I \subset \mathbb{R}$ and a Lorentzian metric $g$ on $M = I \times \Sigma$ such that $(M, g, \iota)$, 
where $\iota : \Sigma \to M$ is given by $\iota(x) = (0,x)$, is a Cauchy development of $(\Sigma,\tilde{g},k)$. Moreover, $(M,g)$ 
is automatically a globally hyperbolic space-time, with Cauchy surface $\iota(\Sigma)$. 


In §7.5 we have only sketched the part of the existence proof that reduces the Einstein equations 
to a simpler problem involving quasilinear hyperbolic PDEs, whose theory we briefly review in 
Appendix B; the entire proof can be found in the literature.³³² We now turn to uniqueness. 


Theorem 7.8 (Geometric uniqueness of solutions of Einstein’s equations) Any two Cauchy 
developments $(M_1, g_1,\iota_1)$ and $(M_2, g_2, \iota_2)$ of the same (smooth) initial data are locally isometric, 
in that $\iota_1(\Sigma)$ and $\iota_2(\Sigma)$ have open neighbourhoods $U_1$ and $U_2$ in $M_1$ and $M_2$, respectively, such 
that $(U_1,g_1)$ and $(U_2, g_2)$ are isometric through a diffeomorphism $\psi : U_1 \to U_2$ satisfying 


$$\psi^* g_2 = g_1; \qquad \psi \circ \iota_1 = \iota_2. \qquad (7.153)$$


The proof is very involved,³³³ but the idea is the argument for underdeterminacy explained at the 
beginning of §7.5, where we require $\psi$ to preserve the initial data (this is Hilbert’s version of 
Einstein’s Hole Argument, cf. §1.5). Technically, construct wave maps $h_i : M \to M_i$ ($i = 1,2$), 
suitably shrunk so as to become diffeomorphisms, and define $g_i' = h_i^* g_i$ on $M$. This brings both $g_1$ 
and $g_2$ into the $\hat{g}$-wave gauge. These new metrics solve the same equations, namely the reduced 
Einstein equations and the $\hat{g}$-wave gauge condition, with the same initial conditions. Hence they 
must coincide by the local uniqueness result from hyperbolic PDEs. From $g_1' = g_2'$ we then obtain 


$$g_2 = (h_1 \circ h_2^{-1})^* g_1 = \psi_* g_1, \qquad \psi := h_2 \circ h_1^{-1}. \qquad (7.154)$$


Definition 7.9 A maximal Cauchy development $(M_{\max}, g_{\max}, \iota_{\max})$ of given (smooth) initial 
data $(\Sigma,\tilde{g},k)$ satisfying the constraints (7.148) - (7.149) is a Cauchy development with the 
property that for any other Cauchy development $(M, g,\iota)$ of these data there exists an embedding 


$$\psi : M \to M_{\max}$$

that preserves time orientation, metric, and Cauchy surface, i.e., one has 

$$\psi^* g_{\max} = g; \qquad \psi \circ \iota = \iota_{\max}. \qquad (7.155)$$


Compare with (7.153). Since a maximal Cauchy development is always globally hyperbolic, it is 
also called a maximal globally hyperbolic development or MGHD of the initial data. 

The word “maximal” is confusing. It does not imply that $(M_{\max}, g_{\max}, \iota_{\max})$ is maximal as a 
solution to the vacuum Einstein equations with given initial data, or as a space-time. It does not 
even mean that $(M_{\max}, g_{\max})$ cannot have any globally hyperbolic extensions. It does mean that: 


331 We only discuss smooth initial data. See footnote 338 for the non-smooth case. As shown in Appendix B, the 
smooth case is proved from the case with initial data and thence solutions in Sobolev spaces $H^s$ and letting $s \to \infty$. 

332See e.g. Choquet-Bruhat (2009), chapter VI and Appendix II, and in Ringström (2009), chapter 14. 

333See e.g. Choquet-Bruhat (2009), Theorem VI.8.4, or Ringström (2009), Theorem 14.3. 
If $(M_{\max}, g_{\max}, \iota_{\max})$ can be (properly) isometrically embedded in some space-time $(M',g')$, then 
the ensuing copy of $\Sigma$ in $M'$ (arising from $\Sigma \to M \to M'$) cannot be a Cauchy surface in $M'$. 


In particular, $\Sigma \subset M'$ would have a nonempty Cauchy horizon. This is often taken to indicate an 
end to determinism, but this seems an overstatement. The correct statement is that the existence 
of a Cauchy horizon for $\Sigma$ in the extension $M'$ means that $(M',g')$, unlike $(M,g)$, is no longer 
predictable from initial data on $\Sigma$. It may in principle be predictable from some new Cauchy 
surface $\Sigma'$ that is not the image of any Cauchy surface in $M$ under the embedding, although in 
typical examples (cf. §10.6) the larger space-time $(M',g')$ is in fact not globally hyperbolic.³³⁴ 
Strong cosmic censorship (in its current formulation, which is different from Penrose’s original 
one) excludes such extensions, and so we will take up this topic in more detail in §§10.4-10.5. 
We now come to the main abstract result in the initial-value approach to GR. 


Theorem 7.10 (Choquet-Bruhat and Geroch) Each smooth initial data set $(\Sigma,\tilde{g},k)$ satisfying 
the constraints has a maximal Cauchy development $(M_{\max}, g_{\max}, \iota_{\max})$, which is unique up to 
time-orientation-preserving isometries fixing the Cauchy surface $\iota(\Sigma) \subset M_{\max}$, as in (7.153). 


For understanding both the claim and its proof it is useful to rephrase Theorem 7.10 in terms of 
partially ordered sets (posets). We already saw that Cauchy developments of fixed initial data are 
far from unique due to diffeomorphism invariance of the Einstein equations. We circumvent this 
apparent lack of determinism by declaring two solutions equivalent if they can be transformed 
onto each other by a diffeomorphism respecting ı as well as time orientation. Thus we say that 


$$(M_1, g_1,\iota_1) \cong (M_2, g_2,\iota_2) \qquad (7.156)$$


iff there is a time-orientation preserving diffeomorphism $\psi : M_1 \to M_2$ satisfying (7.153). This 
is an equivalence relation on the set $\mathrm{GHD}(\Sigma,\tilde{g},k)$ of all globally hyperbolic (i.e. Cauchy) 
developments of the data $(\Sigma,\tilde{g},k)$. We denote the (quotient) set of its equivalence classes by 
$[\mathrm{GHD}](\Sigma, \tilde{g},k)$. As usual, we write $[M, g,\iota]$ for the equivalence class of $(M,g,\iota)$. 


Definition 7.11 Initially, put 
$$(M_1,g_1,\iota_1) \leq (M_2, g_2, \iota_2) \qquad (7.157)$$


iff there is an embedding $\psi : M_1 \to M_2$ such that (7.153) holds. This fails to be a partial ordering 
on $\mathrm{GHD}(\Sigma, \tilde{g},k)$ (it fails the antisymmetry axiom), but it does descend to a partial ordering on 
$[\mathrm{GHD}](\Sigma, \tilde{g},k)$. By abuse of notation, provided (7.157) holds we may therefore write 


$$[M_1,g_1,\iota_1] \leq [M_2,g_2, \iota_2]. \qquad (7.158)$$

This makes $([\mathrm{GHD}](\Sigma, \tilde{g},k), \leq)$ a partially ordered set (poset). 


Recall that a top element $\top \in P$ of a poset $(P, \leq)$ is an element for which $x \leq \top$ for all $x \in P$. 
A top element need not exist, but it is unique if it exists.³³⁵ Theorem 7.10 then becomes: 


Theorem 7.12 The poset $([\mathrm{GHD}](\Sigma,\tilde{g},k), \leq)$ has a top element (which is necessarily unique). 
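
The difference between a top element and a merely maximal element is easy to miss, so here is a pair of toy posets illustrating it. These examples are our own and are not taken from the text; they only use the definitions just recalled.

```latex
% Two toy posets (illustration only).
\begin{align*}
P_1 &= \{a, b, c\}, \quad a \leq c,\ b \leq c
   &&\Rightarrow\ c \text{ is a top element (and the only maximal element)};\\
P_2 &= \{a, b\}, \quad \text{no relations except } a \leq a,\ b \leq b
   &&\Rightarrow\ a \text{ and } b \text{ are both maximal, but there is no top element}.
\end{align*}
```

Theorem 7.12 asserts that the poset of equivalence classes of Cauchy developments behaves like $P_1$ rather than $P_2$: every development embeds into a single largest one.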


334 See Doboszewski (2017, 2019, 2020) and Manchak (2011, 2017) for studies of the (in)extendibility of space-times, 
partly in connection with global hyperbolicity. 

335 This is different from a maximal element $m \in P$, where for all $x \in P$ one has $m \leq x$ iff $x = m$. Maximal elements 
are often non-unique if they exist, and even if they are unique they may not be top elements (which are maximal). 
Note that Theorem 7.12 implies that the maximal Cauchy development $(M_{\max}, g_{\max}, \iota_{\max})$ is 
unique up to isometry.³³⁶ We first rephrase Theorem 7.8 in terms of the above poset: 


Corollary 7.13 Any two Cauchy developments $(M_1,g_1,\iota_1)$ and $(M_2, g_2, \iota_2)$ of given initial data 
have a common Cauchy development $(M, g, \iota)$, in that we have both orderings 


$$(M,g,\iota) \leq (M_1,g_1,\iota_1); \qquad (M,g,\iota) \leq (M_2, g_2,\iota_2). \qquad (7.159)$$

Indeed, take $M = U_1$, with: 
• $\psi_1 : M \to M_1$ given by the embedding $i : U_1 \subset M_1$; 
• $\psi_2 : M \to M_2$ defined by $\psi_2 = \psi \circ i$, where $\psi$ is the map from Theorem 7.8. 


More strongly, we even have: 


Lemma 7.14 Any two Cauchy developments $(M_1,g_1,\iota_1)$ and $(M_2, g_2, \iota_2)$ have a maximal common 
Cauchy development $(M',g',\iota')$, in that any other common Cauchy development $(M,g,\iota)$ satisfies 


$$(M,g,\iota) \leq (M',g',\iota'). \qquad (7.160)$$

Indeed, if $\{U_\alpha\}$ is the set of all $U_1$’s appearing in Theorem 7.8, i.e. $U_\alpha \subset M_1$ with given 
maps $\psi_\alpha : U_\alpha \to M_2$, etc., then one may simply take the union $M' = \bigcup_\alpha U_\alpha$, with the obvious 
embedding $M' \subset M_1$, and the map $\psi : M' \to M_2$ given by $\psi(x) = \psi_\alpha(x)$ if $x \in U_\alpha$. Conversely: 


Lemma 7.15 Any two Cauchy developments $(M_1,g_1,\iota_1)$ and $(M_2, g_2, \iota_2)$ have a common extension 
$(M_{12},g_{12}, \iota_{12})$, in that we have both orderings 


$$(M_1, g_1,\iota_1) \leq (M_{12}, g_{12},\iota_{12}); \qquad (M_2, g_2,\iota_2) \leq (M_{12}, g_{12},\iota_{12}). \qquad (7.161)$$

Define an equivalence relation $\sim$ on the disjoint union $M_1 \sqcup M_2$ of $M_1$ and $M_2$ by $x \sim y$ if: 
• either $x = y$; 
• or $x \in M' \subset M_1$ and $y = \psi(x)$, where $\psi : M' \to M_2$ has just been defined. 


The quotient 
$$M_{12} = (M_1 \sqcup M_2)/\sim \qquad (7.162)$$


inherits a metric $g_{12}$ from $(M_1, g_1)$ and $(M_2, g_2)$, as follows: 
• for $x \in M_1 \setminus M'$ we put $g_{12}([x]) := g_1(x)$; 
• for $y \in M_2 \setminus \psi(M')$ take $g_{12}([y]) := g_2(y)$, noting that $[x] = x$ and $[y] = y$ in those cases; 
• for $x \in M'$ and $y = \psi(x)$, so that $[x] = [y]$, we put $g_{12}([x]) := g_1(x)$ $(= g_2(y))$. 


336 Choquet-Bruhat & Geroch (1969) sketched a proof based on Zorn’s lemma, which they even had to use twice. 
The corresponding proof in Ringström (2009), §14, is wrong, but is corrected in Ringström (2013), §23. Instead, we 
outline a recent constructive proof due to Sbierski (2016), with some improvements by Wong (2013). 
The obvious maps $M_1 \hookrightarrow M_{12}$ and $M_2 \hookrightarrow M_{12}$ are isometries for $g_{12}$ by construction.³³⁷ Similarly, 
we obtain an embedding $\Sigma \hookrightarrow M_{12}$ from the given ones $\Sigma \hookrightarrow M_1$ and $\Sigma \hookrightarrow M_2$. 
The construction of the maximal space-time $M_{\max}$ is an extension of (7.162). One defines 


$$M_{\max} = \Big( \bigsqcup_\lambda M_\lambda \Big) / \sim, \qquad (7.163)$$


where $\{M_\lambda\}$ is the set of all Cauchy developments (of the given initial data), and we identify 
$x \in M_1$ and $y \in M_2$ (where 1 and 2 are generic values of $\lambda$) iff $x \sim y$ as defined after (7.162). 
Also, the constructions of the metric $g_{\max}$, the embedding $\iota_{\max}$, and the (isometric) embeddings 


$$\psi_\lambda : M_\lambda \to M_{\max} \qquad (7.164)$$


are entirely similar to the case (7.162) just explained. Maximality is then obvious. 


Theorem 7.10 is stated for smooth initial data, which give rise to smooth 4-metrics. However, 
the local existence results whose proof we omitted are proved by taking limits of existence results 
for rougher initial data in Sobolev spaces $H^s(\Sigma)$ as $s \to \infty$ (see Appendix B.3 for notation and 
details). These lower regularity results are also of interest as such. In particular, we have:³³⁸ 


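Theorem 7.16 Let $s > 3/2$. For initial data $(\Sigma, \tilde{g}_{ij},k_{ij})$ where $\tilde{g}$ is sufficiently close to $\gamma$ and 


$$\tilde{g} \in H^{s+1}_{(2,0)}(\Sigma); \qquad k \in H^{s}_{(2,0)}(\Sigma), \qquad (7.165)$$


there is $T > 0$ such that the reduced vacuum Einstein equations (7.121), or their counterparts in 
a $\hat{g}$-wave gauge, have a unique solution $g$ on $M = [0,T] \times \Sigma$, where 


$$g_{\mu\nu} \in C([0,T],H^{s+1}(\Sigma)) \cap C^1([0,T],H^{s}(\Sigma)); \qquad (7.166)$$
$$\partial_t g_{\mu\nu} \in C([0,T],H^{s}(\Sigma)). \qquad (7.167)$$


This solution continuously depends on the initial data, in that $\tilde{g}_i \to \tilde{g}$ in $H^{s+1}_{(2,0)}(\Sigma)$ and $k_i \to k$ in 
$H^{s}_{(2,0)}(\Sigma)$ imply $g_i \to g$ in $L^\infty([0,T],H^{s+1}(\Sigma))$ as well as $\partial_t g_i \to \partial_t g$ in $L^\infty([0,T], H^s(\Sigma))$. 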


For $s > m + 3/2$, the Sobolev embedding theorem (B.23) gives $H^s(\Sigma) \subset C^m(\Sigma)$, so that for $s > 3/2$ 
and hence $m = 0$, eq. (7.165) implies that $\tilde{g} \in C^1(\Sigma)$ and $k \in C(\Sigma)$, upon which eqs. (7.166) 
- (7.167) then imply $g \in C^1(M)$ and hence $\partial g \in C(M)$. Another refinement is localization. For 
example, Theorem 7.8 gives rise to what is best seen as a causality result: 


Proposition 7.17 Let $(\tilde{g}_{ij},k_{ij})$ and $(\tilde{g}'_{ij},k'_{ij})$ be (smooth) initial data on $\Sigma$ that coincide on some 
submanifold $\Sigma_0 \subset \Sigma$. Then any two Cauchy developments $([0,T] \times \Sigma, g)$ and $([0,T'] \times \Sigma,g')$ of 
these data are isometric when restricted to $D^+(\Sigma_0) \subset [0,T''] \times \Sigma_0$, where $T'' = \min\{T,T'\}$. 


337 The main difficulty in the proof is to show that $M_{12}$ is a Hausdorff space; see the references in footnote 336. 
338 Here $\gamma$ is a fiducial Riemannian metric on $\Sigma$ enabling a coordinate-independent definition of Sobolev spaces 
on $\Sigma$. The index $(2,0)$ in $H^{s}_{(2,0)}(\Sigma)$ refers to the tensor character of $\tilde{g}$ and $k$; one has such Sobolev spaces for any 
(k,l). Choquet-Bruhat’s original existence proof had s > 3/2 but (geometric) uniqueness required s > 5/2, see 
Choquet-Bruhat, Theorem 8.4, p. 168 (note that her s is our s — 1 so that our s > 5n is her s > 5n + 1, etc.). For 
s > 3/2, for existence and uniqueness, also in Theorem 7.8 and Theorem 7.10, see Chruściel (2014), Theorem 
1.1, or, using very different techniques, Fischer & Marsden (1979), Theorem 4.24. The world record is s = 1, i.e. 
$\tilde{g} \in H^2(\Sigma)$ and $k \in H^1(\Sigma)$ (Klainerman, Rodnianski, & Szeftel, 2015). 
This does not follow from (the proof of) Theorem 7.8 alone (i.e. by reduction to a wave gauge). In 
addition, one needs a uniqueness (or causality) result for quasi-linear wave equations, which states 
that if two solutions have the same initial data on some submanifold $\Sigma_0 \subset \Sigma$, then they coincide 
on the domain of dependence $D^+(\Sigma_0)$. See Appendix B.3 for some more background.³³⁹ 

In sum, a MGHD $(M,g,\iota)$ of initial data $(\Sigma,\tilde{g},k)$ for the Einstein equations enjoys:³⁴⁰ 


1. Existence, with satisfactory regularity dictated by regularity of the initial data $(\Sigma, \tilde{g},k)$. 

2. Maximality, at least within the realm of globally hyperbolic solutions. 

3. Uniqueness up to isometry, in the precise sense stated in Theorem 7.10. 

4. Causal propagation, in that initial data at $\Sigma_0 \subset \Sigma$ determine the solution within $D^+(\Sigma_0)$. 

5. Cauchy stability, in that the 4-metric $g$ continuously depends on the initial data $(\Sigma, \tilde{g},k)$. 

These features of the initial-value approach to GR have given rise to an ideology in which: 

• All valid assumptions about GR are assumptions about initial data $(\Sigma, \tilde{g},k)$. 

• All valid questions in GR are questions about the MGHD $(M, g,\iota)$ of these data. 


This PDE-based program has so far had spectacular successes.³⁴¹ It sometimes gives a slightly 
different perspective from the (Penrosian) mathematical approach to GR originating in the 1960s, 
in which typically larger (e.g. analytically extended) space-times are studied. See §§10.4-10.5. 
In fact, even the PDE results stated above should be seen as “classical” in the somewhat 
different sense that they used spacelike Cauchy surfaces. Since the 1990s, much progress in the 
initial-value approach to GR has been made by giving initial data on certain null hypersurfaces, 
which lead to a characteristic initial value problem.³⁴² The idea of solving PDEs through 
characteristics originally came from first-order PDEs.³⁴³ The simplest version is the PDE 


$$Xf = 0, \qquad (7.168)$$


where $X \in \mathfrak{X}(M)$. This is solved by any $f \in C^\infty(M)$ that is constant along the integral curves 
(i.e. flow) of the vector field $X$, which in this context are called the characteristics of the PDE. 
Thus the PDE is effectively replaced by an ODE, namely integrating $X$. In the usual Cauchy 
problem, one fixes a solution $f$ by prescribing its value on a non-characteristic (Cauchy) surface 
$\Sigma \subset M$, in the sense that the characteristics are nowhere tangent to $\Sigma$ (otherwise, one may have 
constraints on the initial data and have both an under- and overdetermined problem). 
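
A concrete instance may make the method of characteristics more tangible. The sketch below takes the simplest nontrivial choice, a constant-coefficient transport field on $\mathbb{R}^2$; the coordinates $(t,x)$, the constant $c$, and the profile $F$ are our own illustrative assumptions and do not appear in the text.

```latex
% Assumption for illustration: M = \mathbb{R}^2 with coordinates (t,x),
% X = \partial_t + c\,\partial_x with c a constant.
\begin{align*}
Xf = \partial_t f + c\,\partial_x f &= 0,\\
\text{characteristics:}\quad \dot{t} = 1,\ \dot{x} = c \ &\Longrightarrow\ x - ct = \text{constant},\\
\text{general solution:}\quad f(t,x) &= F(x - ct), \qquad F(x) = f(0,x).
\end{align*}
```

The line $t = 0$ is non-characteristic, since $X$ has a nonzero $\partial_t$ component, so the data $F$ there fix $f$ everywhere. Data on a characteristic $x - ct = \text{constant}$ would instead have to be constant along that line and would say nothing about $f$ elsewhere, which is the under- and overdetermination mentioned above.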


339 For a more detailed treatment cf. Choquet-Bruhat (2009), Appendix III, Theorem 2.15. 

340These points are developed in far greater detail in Choquet-Bruhat (2009) and Ringström (2009, 2013). 

341 These started with the proof of stability of Minkowski space-time under small perturbations of the initial data 
(Christodoulou & Klainerman, 1993), and at the time this book went to press culminated in analogous stability 
results for the Schwarzschild metric (Dafermos, Holzegel, Rodnianski, & Taylor, 2021) and the slowly rotating Kerr 
metric (Hafner, Hintz, & Vasy, 2019; Klainerman & Szeftel, 2021). See also the references in footnote 297. 

342For GR this goes back to Penrose (1963), written in 1961 and republished in 1980, and Bondi and Sachs (see 
references in Chrusciel & Paetz, 2012). In Penrose’s spinorial approach (see also Penrose & Rindler, 1984, and 
more briefly Stewart, 1991) there are no constraints at all. It was further developed by Friedrich (1979), and, for 
numerical relativity, by Stewart & Friedrich (1982) and Friedrich & Stewart (1983). Existence theorems go back to 
Rendall (1990) and were later improved by Luk (2012). See also Christodoulou & Klainerman (1993), Klainerman 
& Nicolo (2003a), Christodoulou (2008), Choquet-Bruhat, Chrusciel, & Martin-Garcia (2011), and Aretakis (2013). 

343For the classical theory see e.g. Courant & Hilbert (1962) or Rauch (2012). 
The general idea of solving or at least simplifying some PDE by solving an associated 
“characteristic” ODE also works for certain second-order hyperbolic PDEs. The simplest example 
is the wave equation $(-\partial_t^2 + \partial_x^2) f = 0$ in $d = 2$. With $u = t - x$ and $v = t + x$, this is solved by 
$f(u,v) = g(u) + h(v)$. In other words, any function that is constant along either all characteristics 
$u$ = constant or along all characteristics $v$ = constant is a solution. In the usual Cauchy problem 
one gives initial data $f(0,x)$ and $\dot{f}(0,x)$ at $t = 0$, or on more general spacelike Cauchy surfaces 
$\Sigma$, since as long as $\Sigma$ is spacelike the characteristics are nowhere tangent to it. However, in this 
case it is perfectly reasonable, and perhaps even more natural, to prescribe initial data on some 
fixed characteristic $u$ = constant together with a fixed characteristic $v$ = constant. For example, 
one may take the lightcone $(u = 0) \cup (v = 0)$, which obviously fixes both $g$ and $h$, and hence $f$. 
This also works locally, in the sense that we may take two finitely extended fd lightlike lines $N_1$ 
and $N_2$ that emanate from the same point (or, from a different point of view, would intersect at 
that point), forming a “V” (the apex is not supposed to be part of either $N_1$ or $N_2$). In that case, 
prescribing $f$ on $N_1 \cup N_2$ fixes the solution (still to the 2d wave equation) at least on the future 
domain of dependence $D^+(N_1 \cup N_2)$, as one easily verifies from a picture. 
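
The claims about the 2d wave equation can be verified by a short computation in null coordinates. The following sketch only uses the chain rule; the final formula expresses $f$ through its values on the two characteristics $u = 0$ and $v = 0$ (here $g$ and $h$ are the two arbitrary functions of the general solution, not a metric).

```latex
% Null coordinates for the 2d wave equation.
\begin{align*}
u = t - x,\quad v = t + x \ &\Longrightarrow\
   \partial_t = \partial_u + \partial_v,\quad \partial_x = -\partial_u + \partial_v,\\
-\partial_t^2 + \partial_x^2 &= -4\,\partial_u\partial_v,\\
\partial_u\partial_v f = 0 \ &\Longrightarrow\ f(u,v) = g(u) + h(v),\\
f(u,v) &= f(u,0) + f(0,v) - f(0,0).
\end{align*}
```

The last line follows from $f(u,0) = g(u) + h(0)$, $f(0,v) = g(0) + h(v)$, and $f(0,0) = g(0) + h(0)$. It shows that characteristic data determine $f$ itself, even though $g$ and $h$ are separately fixed only up to a constant that can be shifted between them.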

This 2d setting has two different generalizations to $d = 3$ or 4. One may specify initial data: 


• either on an open (truncated or semi-infinite) fd null cone emanating from its apex;³⁴⁴ 


• or on two bounded open null hypersurfaces $N_1 = C$, $N_2 = \underline{C}$ as described in §6.3. 


We briefly summarize the latter scheme, which is more popular than the former. In $d = 4$, or in 
$d = 3$, where the two-sphere $S^2$ is replaced by the circle $S^1$, $C$ ($\underline{C}$) is a null hypersurface that is: (i) 
bounded in the past by a spacelike sphere; (ii) generated by the fd lightlike geodesics integrating 
the lightlike vector field $L$ ($\underline{L}$), and (iii) foliated by two-spheres $S_t$ ($\underline{S}_{\underline{t}}$), see (6.61), (6.81), for 
some range $0 < t < t_0$ ($0 < \underline{t} < \underline{t}_0$) for which $C$ ($\underline{C}$) is smooth. If $(x^1,x^2)$ are coordinates on $S^2$, 
then $(x^1,x^2,t)$ and $(x^1,x^2,\underline{t})$ are coordinates on $N_1$ and $N_2$, respectively. 

In the wave gauge (7.105), suitable “characteristic” initial data on $N_1 \cup N_2$ for the Einstein 
equations in vacuum (as well as for certain matter sources, including electromagnetism) are 
provided by a family $t \mapsto \tilde{g}_t$ of 2d Riemannian metrics on the spheres $S_t$ foliating $C$, plus a family 
$t \mapsto k_t$ of covariant symmetric 2-tensors playing the role of (6.73), i.e. of the “null extrinsic 
curvature”, and similar data on $N_2$. These are supplemented by a scalar function and a 1-form on 
S*. The first of these will be initial value for the 8u component of the 4d metric g, whilst the 
second is an initial value for what is called the “torsion” $X \mapsto \zeta(X) := g(\nabla_X L, \underline{L})$. 

These initial data are constrained in a very different way from the spacelike case. Apart from 
certain continuity and compatibility requirements, the key constraint on the tensors $\tilde{g}_t$ and $k_t$ on 
$N_1$ is given by the null Raychaudhuri equation (6.98), in which $\theta$ is defined by (6.78), and by a 
similar equation for the initial data on $N_2$. These Raychaudhuri equations are ODEs (as opposed 
to the elliptic PDEs in the usual approach), which is of course a major simplification. In fact, as 
in the spacelike case (cf. §8.6), but with very different details, unconstrained initial data may 
be given using conformal methods. In particular, the unconstrained metric data are conformal 
equivalence classes of such families of 2d Riemannian metrics on the spheres $S_t$ and $\underline{S}_{\underline{t}}$. 

This leads to a counterpart to Theorem 7.7, i.e. one locally obtains globally hyperbolic 
solutions from such initial data, which are unique in appropriate coordinates.³⁴⁵ However, the 
analogue of a coordinate-free Cauchy development of the initial data remains to be formulated 
precisely. A fortiori, a “characteristic” version of Theorem 7.10 is still waiting to be proved. 


344 See Choquet-Bruhat, Chruściel, & Martin-Garcia (2011) for this. 
345 See Luk (2012), who extended the region in which Rendall (1990) proved the existence of solutions. 
7.7 Geometric form of the constraints 


In this section we pay our debt by (twice!) deriving the (vacuum) constraints (7.148) - (7.149); 
matter sources give additional terms in the constraints, see (8.65) - (8.67). Thus we assume a 
spacelike hypersurface $\Sigma \subset M$ of a space-time $(M, g)$ that solves the vacuum Einstein equations 
(it is not necessary for this derivation that $\Sigma$ be a Cauchy surface). The constraints are geometric 
and hence coordinate-independent, but their derivation is most easily done in coordinates $(x^\mu)$ 
where $(x^i)$ are coordinates on $\Sigma$, $g_{00} = -1$, and $g_{0i} = 0$. Such coordinates always exist 
locally, see Proposition 8.1 in §8.1. In such coordinates, the fd unit normal to $\Sigma$ is simply 

$$N^\mu = \delta^\mu_0 = (1,0,0,0); \qquad N_\mu = (-1,0,0,0). \qquad (7.169)$$
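In such coordinates, the Gauss relation (4.148) reads, with spatial indices $i,j,k,l = 1,2,3$, 

$$R_{ijkl} = \tilde{R}_{ijkl} + k_{ik} k_{jl} - k_{il} k_{jk}. \qquad (7.170)$$

Contracting this to the spatial Ricci tensor $R_{ij} = g^{\mu\nu} R_{\mu i \nu j}$ and Ricci scalar $R = g^{\mu\nu}R_{\mu\nu}$ gives 


$$R_{ij} + R_{0i0j} = \tilde{R}_{ij} + \mathrm{Tr}(k)\, k_{ij} - k^2_{ij}; \qquad (7.171)$$


$$R + 2R_{00} = \tilde{R} + \mathrm{Tr}(k)^2 - \mathrm{Tr}(k^2), \qquad (7.172)$$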


so that (7.148) is precisely the geometric form of the so-called Hamiltonian constraint 

$$G_{00} := R_{00} - \tfrac{1}{2} g_{00} R = R_{00} + \tfrac{1}{2} R = 0. \qquad (7.173)$$

In the same spirit, in our coordinate system Codazzi’s equation (4.149) comes down to 

$$R_{0ijk} = \tilde{\nabla}_j k_{ik} - \tilde{\nabla}_k k_{ij}. \qquad (7.174)$$

Contracting to the Ricci tensor gives 

$$R_{0i} = g^{\mu\nu} R_{\mu 0 \nu i} = \tilde{g}^{jk} R_{j0ki} = \tilde{g}^{jk} R_{0jik} = \partial_i \mathrm{Tr}(k) - \tilde{\nabla}_j k^j{}_i. \qquad (7.175)$$

Contracting to the Ricci scalar is unnecessary, since the momentum constraint is simply 

$$G_{0i} := R_{0i} - \tfrac{1}{2} g_{0i} R = R_{0i} = 0, \qquad (7.176)$$


so that (7.149) follows from (7.175). Note that $\partial_i \mathrm{Tr}(k) = \tilde{\nabla}_i \mathrm{Tr}(k)$, as $\mathrm{Tr}(k)$ is a scalar. 
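
For readers who want to see the contraction leading to the Hamiltonian constraint spelled out, here is a sketch of the intermediate steps. It assumes the adapted coordinates of this section ($g_{00} = -1$, $g_{0i} = 0$) and the Gauss relation in the form $R_{ijkl} = \tilde{R}_{ijkl} + k_{ik}k_{jl} - k_{il}k_{jk}$, with tildes marking quantities intrinsic to the hypersurface; the grouping of terms is ours.

```latex
% Adapted coordinates assumed: g^{\mu\nu} R_{\mu i \nu j} = -R_{0i0j} + \tilde{g}^{kl} R_{kilj}.
\begin{align*}
\tilde{g}^{ik} R_{ijkl} &= R_{jl} + R_{0j0l}
   = \tilde{R}_{jl} + \mathrm{Tr}(k)\,k_{jl} - k_{il}\,k^{i}{}_{j},\\
\tilde{g}^{jl}\left(R_{jl} + R_{0j0l}\right) &= (R + R_{00}) + R_{00}
   = \tilde{R} + \mathrm{Tr}(k)^2 - \mathrm{Tr}(k^2),\\
2\,G_{00} = 2R_{00} + R &= \tilde{R} + \mathrm{Tr}(k)^2 - \mathrm{Tr}(k^2) .
\end{align*}
```

In the second line we used $\tilde{g}^{jl}R_{jl} = R + R_{00}$ and $\tilde{g}^{jl}R_{0j0l} = R_{00}$, both of which follow from $g^{00} = -1$ and $R_{0000} = 0$. Setting $G_{00} = 0$ then gives the scalar constraint.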
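We now also present a coordinate-free proof of (7.148) - (7.149), via a 4d-version of the 
3d-objects $\tilde{g}$ and $k$ defined on $\Sigma$. These are given in any coordinates by 


$$\tilde{g}_{\mu\nu} := g_{\mu\nu} + N_\mu N_\nu; \qquad (7.177)$$
$$k_{\mu\nu} := -\tilde{g}^\rho{}_\mu\, \tilde{g}^\sigma{}_\nu\, \nabla_\rho N_\sigma. \qquad (7.178)$$


See (6.11) - (6.12).³⁴⁶ Note that indices are raised and lowered with $g$, so that 


$$\tilde{g}^\mu{}_\nu = \delta^\mu_\nu + N^\mu N_\nu. \qquad (7.179)$$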


346Note the minus sign in (7.178) compared to (6.12), which is a consequence of different conventions in fluid 
mechanics and differential geometry. Many physics texts have a plus in (7.178). 
This tensor is also called $h^\mu{}_\nu$. Taken at $x \in M$, it is the matrix of the orthogonal projection operator 
(4.135). Unlike the original $\tilde{g} \in \mathfrak{X}^{(2,0)}(\Sigma)$, the new $\tilde{g} \in \mathfrak{X}^{(2,0)}(M)$ is defined on any pair of 
vectors $X,Y \in T_xM$ ($x \in \Sigma$), though the extension is somewhat trivial in that $\tilde{g}(X,N) = 0$ for 
any $X \in T_xM$, whilst for $X,Y \in T_x\Sigma$ the value $\tilde{g}(X,Y)$ defined from (7.177) equals the original $\tilde{g}(X,Y)$ defined from 
(4.124). Hence the ambiguous notation is admissible and it is always clear which $\tilde{g}$ is meant. 
Likewise for $k$ in (7.178). In terms of the projection $\pi_x$, for all $x \in \Sigma$ and $X,Y \in \mathfrak{X}(M)$, 


&x(X,Y) = g(m(X), T™(Y)); (7.180) 
Ki Shay). (7.181) 


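The Gauss-Codazzi identities (4.148) - (4.149) are now rewritten as 

$$\tilde{g}^\rho{}_\gamma\, \tilde{g}^\sigma{}_\delta\, \tilde{g}^\mu{}_\alpha\, \tilde{g}^\nu{}_\beta\, R_{\rho\sigma\mu\nu} = \tilde{R}_{\gamma\delta\alpha\beta} + k_{\gamma\alpha} k_{\delta\beta} - k_{\gamma\beta} k_{\delta\alpha}; \qquad (7.182)$$
$$\tilde{g}^\rho{}_\gamma\, N^\sigma\, \tilde{g}^\mu{}_\alpha\, \tilde{g}^\nu{}_\beta\, R_{\rho\sigma\mu\nu} = \tilde{\nabla}_\beta k_{\alpha\gamma} - \tilde{\nabla}_\alpha k_{\beta\gamma}. \qquad (7.183)$$

The corresponding contracted Gauss relations easily follow from (7.182), and are given by 


$$\tilde{g}^\mu{}_\alpha\, \tilde{g}^\nu{}_\beta\, R_{\mu\nu} + \tilde{g}^\sigma{}_\alpha\, \tilde{g}^\nu{}_\beta\, N^\rho N^\mu R_{\rho\sigma\mu\nu} = \tilde{R}_{\alpha\beta} + \mathrm{Tr}(k)\, k_{\alpha\beta} - k^2_{\alpha\beta}; \qquad (7.184)$$
$$R + 2 N^\mu N^\nu R_{\mu\nu} = \tilde{R} + \mathrm{Tr}(k)^2 - \mathrm{Tr}(k^2), \qquad (7.185)$$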


where we used the following identity and notations: 


Br = BPP =P ANON": (7.186) 
as er GaSe hav (7.187) 


If we now write the Hamiltonian constraint $G_{00} = 0$ in pseudo-covariant form as 


it is clear that (7.185) and (7.189) reproduce (7.148). 
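Similarly, the contracted Codazzi relations (which stop at one stage) follow from (7.183) as 


$$N^\mu\, \tilde{g}^\nu{}_\alpha\, R_{\mu\nu} = \partial_\alpha \mathrm{Tr}(k) - \tilde{\nabla}_\beta k^\beta{}_\alpha. \qquad (7.190)$$

The momentum constraint $G_{0i} = 0$ is now written pseudo-covariantly as 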


since $g_{\mu\nu} N^\mu \tilde{g}^\nu{}_\alpha = 0$. With (7.190), this recovers (7.149), and we are done. See also §8.3. 
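
One way to make the pseudo-covariant forms of the two constraints explicit is sketched below. It uses only $N^\mu N_\mu = -1$, the orthogonality of $N$ to the spatial projection, and the contracted Gauss and Codazzi relations (7.185) and (7.190); the numbered equations in the text may arrange these steps differently, so this should be read as an illustration rather than as a transcription.

```latex
% Sketch: project G_{\mu\nu} = R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R along N and onto \Sigma.
\begin{align*}
N^\mu N^\nu G_{\mu\nu} &= N^\mu N^\nu R_{\mu\nu} - \tfrac{1}{2}\,(N^\mu N_\mu)\,R
   = N^\mu N^\nu R_{\mu\nu} + \tfrac{1}{2} R
   = \tfrac{1}{2}\left(\tilde{R} + \mathrm{Tr}(k)^2 - \mathrm{Tr}(k^2)\right),\\
N^\mu\, \tilde{g}^\nu{}_\alpha\, G_{\mu\nu} &= N^\mu\, \tilde{g}^\nu{}_\alpha\, R_{\mu\nu}
   = \partial_\alpha \mathrm{Tr}(k) - \tilde{\nabla}_\beta k^\beta{}_\alpha .
\end{align*}
```

Setting both projections to zero gives the scalar and vector constraints in their geometric form.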

The Gauss relation (4.148), or, equivalently, (7.170) or (7.182), describes the value of the 
Riemann tensor $R(W,Z,X,Y)$ at four spatial vectors $(W,Z,X,Y)$, whereas the Codazzi relation 
(4.149), or (7.174) or (7.183), gives its value $R(N,Z,X,Y)$ at three spatial directions $X,Y,Z$ and 
one timelike direction $N$. For the dynamical (evolution) equations $G_{ij} = 0$ we will also need the 
case $R(W,N,X,N)$ of two spatial and two orthogonal timelike vectors; unlike the previous two 
cases, which just rely on the embedding $\Sigma \subset M$, this new case will contain derivatives of $\tilde{g}_{ij}$ 
and $k_{ij}$ in the orthogonal (temporal) direction, i.e., in suitable coordinates, $\partial_t \tilde{g}_{ij}$ and $\partial_t k_{ij}$. This 
requires not just a single Cauchy surface $\Sigma \subset M$, but a foliation $M = \bigsqcup_t \Sigma_t$. This is the subject of 
the next chapter; the required identity will be (8.37), or, equivalently, (8.38) or (8.39). 




8 The 3+1 split of space-time 


In this chapter we develop the non-covariant approach of §7.5 through a split of space-time 
into space and time.³⁴⁷ Philosophers would say that this split relates the “scientific” image of
GR to its “manifest image”, since what we experience is space and time separately, rather than 
Minkowski’s (and subsequently also Einstein’s) lofty notion of space-time. The 3 + 1 split is the 
key to e.g. the Hamiltonian approach to GR discussed in §8.7, as well as to numerical relativity. 


8.1 Lapse and shift 


In the previous section we described the constraints $G_{\mu 0} = 0$ in 3 + 1 split geometric form (7.148)
- (7.149). These constraints do not contain time derivatives of $\tilde g_{ij}$ and $\tilde k_{ij}$, whose time-evolution
is governed by the spatial Einstein equations $G_{ij} = 0$. To rewrite these in 3 + 1 form it is not
enough to have a single Cauchy surface $\Sigma \subset M$; we need to assume a foliation

$M = \bigsqcup_{t \in \mathbb{R}} \Sigma_t$ (8.1)

of $M$ by spacelike Cauchy surfaces $\Sigma_t$. In particular, we assume that $(M, g)$ is globally hyperbolic.


The choice of a foliation may be compared with a choice of gauge in the covariant approach 
in §7.5, like the wave gauge (7.105), whose goal it is to single out a unique metric solving the 
Einstein equations within its equivalence class under diffeomorphisms. A foliation by spacelike 
hypersurfaces is a choice of a “now” at each instant of time; it is a hallmark of GR that such a
choice is arbitrary (as long as each $\Sigma_t$ is spacelike). See §1.10 and §8.11. As explained in §7.5,
given such gauge fixing on all of $M$, one only needs to solve the spatial Einstein equations
$G_{ij} = 0$, and impose $G_{\mu 0} = 0$, i.e. (7.148) - (7.149), as constraints on the initial value surface $\Sigma$.


See §8.3. In the light of Theorem 5.44, such a foliation is equivalent to a diffeomorphism


$F: \mathbb{R} \times \Sigma \to M$ (8.2)

with the property that each subspace $\Sigma_t := F_t(\Sigma)$ is spacelike. With $t \in \mathbb{R}$ and $x \in \Sigma$, we write
$F_t: \Sigma \to M$; $F_t(x) := F(t,x)$; (8.3)

$F_x: \mathbb{R} \to M$; $F_x(t) := F(t,x)$, (8.4)


which shows the double role of foliations: for fixed time $t \in \mathbb{R}$ the map $F_t$ is a spacelike
embedding of $\Sigma$ in $M$, whereas for fixed $x \in \Sigma$ the map $F_x$ is a curve through $F(0,x) \in M$. A
priori defined by $F$, such a foliation (8.1) is also equivalent to one of the following structures:


• A temporal function $t: M \to \mathbb{R}$ with $g(\nabla t, \nabla t) < 0$, cf. Definition 5.42 and Theorem 5.44.


• A function called the lapse $L$ and a vector field called the shift $S$ of the foliation.³⁴⁸


The lapse and shift may be defined in a coordinate-independent way by the decomposition 


$\frac{dF_x}{dt} =: LN + S$, (8.5)


seen as an equality between vectors in $T_yM$ for any $y = F(t,x)$, as follows:


347 The 3+1 split originated in the work of Darmois (1927), Lichnerowicz (1939, 1955), and Fourès-Bruhat
(1956); see Choquet-Bruhat (2018). It subsequently crossed the independent development of the Hamiltonian
formalism for GR, in particular through the work of Arnowitt, Deser, & Misner (1962). See footnote 384 in §8.7.

348 One may wonder why something can simultaneously be determined by one function $t$ and by four functions $L$
and $S$, but the metric information in the former is in the four components of the vector field $\nabla t$.

349 Many authors write (8.5) as $\partial_t = Nn + \mathbf{N}$, where $N$ is the lapse, $n$ is the normal, and $\mathbf{N}$ is the shift.

• the left-hand side is the tangent vector at $y$ to (e.g.) the curve $c(s) = F_x(t+s)$ at $s = 0$;
• $L \in C^\infty(M)$ is a scalar whilst $N \in \mathfrak{X}(M)$ is the normal future-directed vector field to $\Sigma_t$;
• $S \in \mathfrak{X}(M)$ is the orthogonal projection of $dF_x/dt$ onto $T_y\Sigma_t \subset T_yM$, hence tangent to $\Sigma_t$.


Here we assume (4.131). Thus, given the metric g on M, a foliation F of M by spacelike Cauchy 
surfaces uniquely defines L and S. Conversely, the idea is that L and S fix a foliation (8.1), but 
not all pairs $(L,S)$ do so, not even if $L > 0$. Starting from a Cauchy surface $\Sigma \subset M$ it turns out
that one may always globally put S = 0, see (8.14) below and Theorem 5.44 (although this may 
not be the wisest choice). In addition, one may locally set L = 1 (see Proposition 8.1 below), but 
the latter is generally not possible globally: if S = 0 and L = 1, then the flow lines of N would 
be (pre)geodesics, whose focusing and hence crossing (in the presence of positive curvature) 
would invalidate the foliation. There might be similar problems with other choices of L and S. 
From the point of view of a temporal function $t$, the lapse and shift are given by


$L = \frac{1}{\sqrt{-g(\nabla t, \nabla t)}}$; $N = -L\nabla t$. (8.6)


We can choose coordinates $(x^0, x^1, x^2, x^3)$ adapted to the foliation (8.1), as follows:


• $x^0 = t$, or, more precisely, $x^0(x) = t$, provided $x \in \Sigma_t$;


• $(x^i)$ are (local) coordinates initially on $\Sigma$ ($i = 1,2,3$), but subsequently on any slice $\Sigma_t$: if
$y \in \Sigma_t$, the flow line of the vector field $\nabla t$ (or $N$) hits $\Sigma$ in exactly one point $x_0 \in \Sigma$; if the
latter has coordinates $x_0 = (0,x^1,x^2,x^3)$, the former has coordinates $y = (t,x^1,x^2,x^3)$.

Given (local) spatial coordinates $(x^1,x^2,x^3)$ on $\Sigma$, at any point $x \in \Sigma_t$ one has tangent vectors
$e_i = \partial_i$ to $\Sigma_t$, as well as a one-form $\theta^0 = dt$. As we have seen, $\partial_t$ may not be orthogonal to $\Sigma_t$,
and hence to the vectors $e_i$, but the shift $S = S^i\partial_i := \sum_{i=1}^3 S^i\partial_i$ corrects for this, in that the vector
$e_0 = \partial_t - S$ (8.7)

is orthogonal to $\Sigma_t$. We then have a frame $(e_a)$ with dual coframe $(\theta^a)$, defined by
$e_0 := \partial_t - S^i\partial_i$; $e_i := \partial_i$; (8.8)
$\theta^0 := dt$; $\theta^i := dx^i + S^i dt$, (8.9)


where $g(e_0,e_i) = 0$ and $g(\theta^0,\theta^i) = 0$, and, by definition, $\theta^a(e_b) = \delta^a_b$ for $a,b = 0,1,2,3$.
By definition of the lapse and the shift, we then have the useful relations 


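$g = -L^2(\theta^0)^2 + \tilde g_{ij}\theta^i\theta^j$; $e_0 = LN = -L^2\nabla t$; (8.10)
$\nabla t = g^{\mu 0}\partial_\mu$; (8.11)

$L = 1/\sqrt{-g^{00}}$; $S^i = -g^{0i}/g^{00}$; (8.12)

$N_\mu = (-L,0,0,0)$; $N^\mu = (1/L, -S^i/L)$. (8.13)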


Consequently, in coordinates adapted to the foliation, the metric and its inverse take the form 


$g_{\mu\nu} = \begin{pmatrix} -L^2 + S_iS^i & S_j \\ S_i & \tilde g_{ij} \end{pmatrix}$; $g^{\mu\nu} = \begin{pmatrix} -1/L^2 & S^j/L^2 \\ S^i/L^2 & \tilde g^{ij} - S^iS^j/L^2 \end{pmatrix}$, (8.14)
where $\tilde g^{ij}$ is the matrix inverse to $\tilde g_{ij}$ and spatial indices are raised and lowered with this spatial
metric (so that e.g. $S_iS^i = \tilde g_{ij}S^iS^j$). Thus $L$ and $S^i$ may also be seen as parametrizations of the
non-spatial components of the metric. The local possibilities are as follows:³⁵⁰


Proposition 8.1 For any Cauchy surface $\Sigma \subset M$ with 3-metric $\tilde g_{ij}$ and extrinsic curvature $\tilde k_{ij}$ in
given coordinates $(x^\mu)$, there exist coordinates $(y^\mu)$ in which $\tilde g_{ij}$ and $\tilde k_{ij}$ are the same, whilst


$g_{00} = -1$; $g_{0i} = 0$ (8.15)


on $\Sigma$. Moreover, in a neighbourhood of $\Sigma$ one can give the components $g_{0\mu}$ any desired value.
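As a quick numerical sanity check of the block forms in (8.14), the following sketch builds the metric and its claimed inverse from a lapse, a shift, and a spatial metric, and confirms that their product is the identity. It uses plain Python only; the sample values and all function names are illustrative assumptions, not taken from the text.

```python
# Check that the two block matrices in (8.14) are mutually inverse.
# Sample values for L, S^i and the 3-metric are arbitrary (hypothetical).

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def inverse3(m):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]

def adm_metric(lapse, shift_up, g3):
    """Return (g_{mu nu}, g^{mu nu}) assembled as in (8.14)."""
    g3_inv = inverse3(g3)
    # Lower the shift index with the spatial metric: S_i = g3_ij S^j.
    shift_down = [sum(g3[i][j] * shift_up[j] for j in range(3)) for i in range(3)]
    s2 = sum(shift_down[i] * shift_up[i] for i in range(3))
    metric = [[-lapse**2 + s2] + shift_down]
    for i in range(3):
        metric.append([shift_down[i]] + list(g3[i]))
    inverse = [[-1.0 / lapse**2] + [s / lapse**2 for s in shift_up]]
    for i in range(3):
        row = [shift_up[i] / lapse**2]
        row += [g3_inv[i][j] - shift_up[i] * shift_up[j] / lapse**2 for j in range(3)]
        inverse.append(row)
    return metric, inverse

L_sample = 1.7
S_sample = [0.3, -0.2, 0.5]
g3_sample = [[2.0, 0.1, 0.0], [0.1, 1.5, 0.2], [0.0, 0.2, 1.0]]

g4, g4_inv = adm_metric(L_sample, S_sample, g3_sample)
product = matmul(g4, g4_inv)
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(4) for j in range(4))
print("largest deviation from the identity:", max_error)
```

Setting the shift to zero and the lapse to one in this sketch reproduces the special form (8.15) of Proposition 8.1.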


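We now extend the pseudo-covariant notation (7.177) - (7.178), originally defined on $\Sigma \subset M$, to
all of $M$, assuming (8.1) and the ensuing extension of the normal vector field from $\Sigma$ to $M$. Then


$k_{\mu\nu} := -\nabla_\mu N_\nu = \tilde k_{\mu\nu} + N_\mu A_\nu$, (8.16)
where the acceleration $A$ of the vector field $N$ is defined by
$A = \nabla_N N$; $A^\nu = N^\mu\nabla_\mu N^\nu$. (8.17)
We now shed interesting new light on the extrinsic curvature $\tilde k$ of $\Sigma \subset M$ by showing that


$\tilde k = -\tfrac12 \mathcal{L}_N \tilde g$ (8.18)
$\tilde k = -\tfrac{1}{2L}\mathcal{L}_{e_0}\tilde g$, (8.19)


seen as equalities between symmetric tensors in either $\mathfrak{X}^{(2,0)}(\Sigma)$ or $\mathfrak{X}^{(2,0)}(M)$; in the former
case the proof of (8.18) in fact implies that $\mathcal{L}_N\tilde g \in \mathfrak{X}^{(2,0)}(\Sigma)$. In arbitrary coordinates, we have


$\tilde k_{\mu\nu} = -\tfrac12 \mathcal{L}_N \tilde g_{\mu\nu}$ (8.20)
$\tilde k_{\mu\nu} = -\tfrac{1}{2L}\mathcal{L}_{e_0}\tilde g_{\mu\nu}$. (8.21)


In coordinates $(t,x^i)$ we may restrict to spatial indices: using (8.8) and (2.94), eq. (8.21) is
$(\partial_t - \mathcal{L}_S)\tilde g_{ij} = -2L\tilde k_{ij}$, (8.22)
which in coordinates where also $L = 1$ and $S = 0$ further simplifies to the transparent equality
$\tilde k_{ij} = -\tfrac12\partial_t\tilde g_{ij}$. (8.23)
Before embarking on the derivation of (8.18) - (8.19), note that (8.23) is easy to derive:
$\tilde k_{ij} = -\nabla_iN_j = -\partial_iN_j + \Gamma^\mu_{ij}N_\mu = -\Gamma^0_{ij} = \tfrac12 g^{00}\partial_t\tilde g_{ij} = -\tfrac12\partial_t\tilde g_{ij}$, (8.24)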


since in coordinates where L = 1 and S = 0 we have (8.15) and hence (7.169), cf. (8.14). 
To derive (8.18), we first use the (1,0) case of (3.72) with $X = N$ to compute

$\mathcal{L}_N N_\mu = N^\nu\nabla_\nu N_\mu + (\nabla_\mu N^\nu)N_\nu = A_\mu$, (8.25)


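350 This proposition is slightly adapted from Chruściel (2010), Proposition 1.4.1, which is also proved there.

since in the second term $(\nabla_\mu N^\nu)N_\nu$ vanishes because of (4.131), which gives

$N^\nu\nabla_\mu N_\nu = \tfrac12\partial_\mu(N^\nu N_\nu) = 0$. (8.26)

Using this as well as (8.16), the (2,0) case of (3.72) with $X = N$ then gives

$\mathcal{L}_N g_{\mu\nu} = \nabla_\mu N_\nu + \nabla_\nu N_\mu = -2\tilde k_{\mu\nu} - N_\mu A_\nu - N_\nu A_\mu$. (8.27)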


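From (7.177), (3.73), (8.16), and (8.27) we then obtain, at last,
$\mathcal{L}_N\tilde g_{\mu\nu} = \mathcal{L}_N(g_{\mu\nu} + N_\mu N_\nu) = -2\tilde k_{\mu\nu} - N_\mu A_\nu - N_\nu A_\mu + N_\mu A_\nu + N_\nu A_\mu$
$= -2\tilde k_{\mu\nu}$. (8.28)
We derive (8.19) from (8.18) using a general fact, namely, using (8.17),
$A_\mu = \tilde\partial_\mu(\ln L) = L^{-1}\tilde g_\mu^\nu\partial_\nu L$, (8.29)


where we use the notation $\tilde\partial_\mu = \tilde g_\mu^\nu\partial_\nu$ for the derivative along $\Sigma$.³⁵¹ Note that the projection $\tilde g_\mu^\nu$
reconfirms that $A$ is tangent to $\Sigma_t$ (i.e., orthogonal to $N$), which we already knew because


$g(N, \nabla_N N) = 0$. (8.30)
Using (8.16) and (7.177), eq. (8.29) is equivalent to
$\nabla_N N_\nu = N^\mu\nabla_\mu N_\nu = L^{-1}(N^\mu N_\nu\partial_\mu + \partial_\nu)L$, (8.31)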


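which we will now prove. The proof relies on torsion-freeness of $\nabla$, which implies $\nabla_\mu\partial_\nu f = \nabla_\nu\partial_\mu f$
for any $f \in C^\infty(M)$. We write (8.13) as $N_\nu = -L\partial_\nu t$ and compute


$N^\mu\nabla_\mu N_\nu = -N^\mu\nabla_\mu(L\partial_\nu t)$
$= -N^\mu(\partial_\mu L\,\partial_\nu t + L\nabla_\mu\partial_\nu t)$
$= L^{-1}N^\mu N_\nu\partial_\mu L + LN^\mu\nabla_\nu(L^{-1}N_\mu)$
$= L^{-1}N^\mu N_\nu\partial_\mu L + LN^\mu N_\mu\partial_\nu L^{-1} + N^\mu\nabla_\nu N_\mu$
$= L^{-1}(N^\mu N_\nu\partial_\mu + \partial_\nu)L$, (8.32)
where we used (8.26). Using (8.10), (2.94), and (8.18), we then compute
$\mathcal{L}_{e_0}\tilde g_{\mu\nu} = \mathcal{L}_{LN}\tilde g_{\mu\nu}$

$= L\mathcal{L}_N\tilde g_{\mu\nu} + N_\mu\partial_\nu L + N_\nu\partial_\mu L + (\partial_\mu L)N^\rho N_\rho N_\nu + (\partial_\nu L)N^\rho N_\rho N_\mu$

$= L\mathcal{L}_N\tilde g_{\mu\nu} + N_\mu\partial_\nu L + N_\nu\partial_\mu L - N_\nu\partial_\mu L - N_\mu\partial_\nu L$

$= -2L\tilde k_{\mu\nu}$. (8.33)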


This exemplifies a general phenomenon concerning $\mathcal{L}_{e_0}$: if any tensor $T \in \mathfrak{X}^{(k,0)}(M)$ satisfies
$T(X_1,\ldots,X_k) = T(\pi(X_1),\ldots,\pi(X_k))$, (8.34)

i.e., $T$ is purely spatial, or, equivalently $T(X_1,\ldots,X_k) = 0$ if $X_i = N$ for at least one $i$, then also
$(\mathcal{L}_{e_0}T)(X_1,\ldots,X_k) = (\mathcal{L}_{e_0}T)(\pi(X_1),\ldots,\pi(X_k))$, (8.35)


that is, also $\mathcal{L}_{e_0}T$ is purely spatial. This most easily follows from the Leibniz rule for $\mathcal{L}$ and
hence the case $k = 1$. Since $e_0 = LN$ we may as well derive $(\mathcal{L}_{e_0}T)(e_0) = 0$ from the assumption
$T(e_0) = 0$: using (2.94) and $\mathcal{L}_{e_0}e_0 = [e_0,e_0] = 0$, we obtain


$(\mathcal{L}_{e_0}T)(e_0) = e_0(T(e_0)) - T(\mathcal{L}_{e_0}e_0) = 0 - 0 = 0$. (8.36)


351 This is consistent with the notation $\tilde\nabla$ for the covariant derivative within $\Sigma$ defined with respect to $\tilde g$ because of
(4.136), which in coordinates reads $\tilde g_\mu^\nu\tilde g_\sigma^\rho\nabla_\nu Y^\sigma = \tilde\nabla_\mu Y^\rho$.

8.2 Beyond Gauss-Codazzi: The Darmois identity


As promised at the end of §7.7, we now derive an identity for Riem(W,N,X,N), the Riemann 
tensor at two spatial and two orthogonal timelike vectors. This is the final identity in a chain: 


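• the first such identity was (4.147) - (4.148), with zero entries of $N$ (due to Gauss);
• the second was (4.149), with one slot occupied by $N$ (due to Codazzi);
• the third will be (8.37) below, involving two copies of $N$ (due to Darmois).


More $N$’s are fruitless, as the Riemann tensor vanishes due to its (anti)symmetries. This new case
will contain expressions like $\mathcal{L}_{e_0}\tilde k$, which unlike terms like $\tilde\nabla_i\tilde k_{jk}$ in (4.149) involves derivatives
in the orthogonal direction. Thus the case of two orthogonal vectors relies on the time function,
or, equivalently, on the foliation (8.1) (at least near $\Sigma = \Sigma_0$). The Darmois identity, then, reads


$\mathrm{Riem}(W,N,X,N) = L^{-1}\big((\mathcal{L}_{e_0}\tilde k)(X,W) + \tilde\nabla_W\tilde\nabla_X L\big) + \tilde k^2(X,W)$, (8.37)
where $X, W \in T\Sigma$. In general coordinates, this expression reads
$\tilde g_\alpha^\rho\tilde g_\beta^\mu N^\sigma N^\nu R_{\rho\sigma\mu\nu} = L^{-1}(\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + \tilde\nabla_\alpha\tilde\nabla_\beta L) + \tilde k^2_{\alpha\beta}$, (8.38)


where $\tilde k^2_{\alpha\beta} = \tilde k_{\alpha\rho}\tilde k^\rho_\beta$, in which the indices on $\tilde k$ are raised and lowered with either $\tilde g$ or $g$ (this
does not matter because any action of the terms $N_\mu N_\nu$ in (7.177) contracts to zero on $\tilde k$), and
$\tilde\nabla_\beta L = \tilde\partial_\beta L$. In coordinates $(t, x^i)$ with zero shift and unit lapse, as before, eq. (8.37) is simply


$R_{i0j0} = \partial_t\tilde k_{ij} + \tilde k^2_{ij}$. (8.39)
To see this, eq. (4.13) gives $R_{i0j0} = (\nabla_j\nabla_0 - \nabla_0\nabla_j)N_i$. Eqs. (8.15) and (7.169) then give
$\nabla_j\nabla_0N_i = \partial_j\nabla_0N_i - \Gamma^\nu_{j0}\nabla_\nu N_i - \Gamma^\nu_{ji}\nabla_0N_\nu = -\Gamma^k_{j0}\nabla_kN_i - \Gamma^0_{ji}\nabla_0N_0 = \Gamma^k_{j0}\tilde k_{ki} - \Gamma^0_{ji}\Gamma^0_{00} = -\tilde k^2_{ij}$;
$-\nabla_0\nabla_jN_i = \nabla_0\tilde k_{ji} = \partial_0\tilde k_{ij} - \Gamma^l_{0i}\tilde k_{lj} - \Gamma^l_{0j}\tilde k_{il} = \partial_t\tilde k_{ij} + \tilde k^l_i\tilde k_{lj} + \tilde k^l_j\tilde k_{il} = \partial_t\tilde k_{ij} + 2\tilde k^2_{ij}$,


since $\nabla_0N_i = -\Gamma^\mu_{0i}N_\mu = \Gamma^0_{0i} = 0$, $\Gamma^k_{j0} = \tfrac12\tilde g^{kl}\partial_t\tilde g_{lj} = -\tilde k^k_j$ from (8.23), $\Gamma^0_{00} = 0$, and $\partial_0 = \partial_t$.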
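To derive the coordinate-free version (8.38), we first note that (8.16) and (8.29) give


$\nabla_\mu N_\nu = -\tilde k_{\mu\nu} - N_\mu\tilde\partial_\nu(\ln L)$. (8.40)
As in the derivation of the Gauss-Codazzi equations, we start from (4.13), this time with $Z = N$:


$R^\rho{}_{\sigma\mu\nu}N^\sigma = (\nabla_\mu\nabla_\nu - \nabla_\nu\nabla_\mu)N^\rho = -\nabla_\mu(\tilde k^\rho_\nu + N_\nu\tilde\partial^\rho\ln L) + \nabla_\nu(\tilde k^\rho_\mu + N_\mu\tilde\partial^\rho\ln L)$
$= \nabla_\nu\tilde k^\rho_\mu - \nabla_\mu\tilde k^\rho_\nu + (\nabla_\nu N_\mu - \nabla_\mu N_\nu)\tilde\partial^\rho\ln L + (N_\mu\nabla_\nu - N_\nu\nabla_\mu)\tilde\partial^\rho\ln L$. (8.41)


This gives
$N^\sigma N^\nu R_{\rho\sigma\mu\nu} = \nabla_N\tilde k_{\rho\mu} - N^\nu\nabla_\mu\tilde k_{\rho\nu} + \tilde\partial_\mu(\ln L)\tilde\partial_\rho(\ln L) + \nabla_\mu\tilde\partial_\rho\ln L + N_\mu\nabla_N\tilde\partial_\rho\ln L$, (8.42)


whose last term will vanish upon contraction with $\tilde g_\beta^\mu$ in (8.38). We rewrite the second term
$N^\nu\nabla_\mu\tilde k_{\rho\nu}$ using the fact that $N^\nu\tilde k_{\rho\nu} = 0$ and hence also $\nabla_\mu(N^\nu\tilde k_{\rho\nu}) = 0$. This gives


$-N^\nu\nabla_\mu\tilde k_{\rho\nu} = \tilde k_{\rho\nu}\nabla_\mu N^\nu = -\tilde k^2_{\rho\mu} - \tilde k_{\rho\nu}N_\mu\tilde\partial^\nu(\ln L)$, (8.43)
whose last term will disappear upon contraction with $\tilde g_\beta^\mu$ in (8.38). We now replace the covariant
derivative in the first term $\nabla_N\tilde k_{\rho\mu}$ by a Lie derivative. Our favorite rule (3.72) gives


$\mathcal{L}_{e_0}\tilde k_{\rho\mu} = \nabla_{e_0}\tilde k_{\rho\mu} + (\nabla_\mu e_0^\nu)\tilde k_{\rho\nu} + (\nabla_\rho e_0^\nu)\tilde k_{\nu\mu}$, (8.44)
in the right-hand side of which we substitute $e_0 = LN$, and hence
$\nabla_{e_0} = L\nabla_N$. (8.45)


Recall that unlike the Lie derivative $\mathcal{L}_X$, the covariant derivative $\nabla_X$ is $C^\infty(M)$-linear in $X$. In
the remaining terms we use (8.40). Many of the ensuing terms drop out after contraction with
$\tilde g_\alpha^\rho\tilde g_\beta^\mu$, and after a lengthy but straightforward computation we obtain


$\tilde g_\alpha^\rho\tilde g_\beta^\mu\nabla_N\tilde k_{\rho\mu} = L^{-1}\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + 2\tilde k^2_{\alpha\beta}$. (8.46)
Using (8.44) and (8.46) in (8.42) finally gives (8.37), as follows:
$\tilde g_\alpha^\rho\tilde g_\beta^\mu N^\sigma N^\nu R_{\rho\sigma\mu\nu} = L^{-1}\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + 2\tilde k^2_{\alpha\beta} - \tilde k^2_{\alpha\beta} + \tilde\partial_\alpha(\ln L)\tilde\partial_\beta(\ln L) + \tilde\nabla_\alpha\tilde\partial_\beta\ln L$
$= L^{-1}(\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + \tilde\nabla_\alpha\tilde\nabla_\beta L) + \tilde k^2_{\alpha\beta}$. (8.47)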
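The last equality in (8.47) combines the two terms involving $\ln L$ into a single Hessian of $L$. A short check of that step, assuming only the Leibniz rule for the spatial covariant derivative and $\tilde\partial_\beta \ln L = L^{-1}\tilde\partial_\beta L$, might read:

```latex
\begin{aligned}
\tilde\nabla_\alpha \tilde\partial_\beta \ln L
  &= \tilde\nabla_\alpha\bigl(L^{-1}\tilde\partial_\beta L\bigr)
   = L^{-1}\tilde\nabla_\alpha\tilde\nabla_\beta L
     - L^{-2}(\tilde\partial_\alpha L)(\tilde\partial_\beta L), \\
\tilde\partial_\alpha(\ln L)\,\tilde\partial_\beta(\ln L)
  &= L^{-2}(\tilde\partial_\alpha L)(\tilde\partial_\beta L), \\
\tilde\partial_\alpha(\ln L)\,\tilde\partial_\beta(\ln L)
  + \tilde\nabla_\alpha \tilde\partial_\beta \ln L
  &= L^{-1}\tilde\nabla_\alpha\tilde\nabla_\beta L .
\end{aligned}
```

The quadratic terms cancel, which is why only $L^{-1}\tilde\nabla_\alpha\tilde\nabla_\beta L$ survives in the Darmois identity.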
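For the Einstein equations we do not need the full Riemann tensor $R_{\rho\sigma\mu\nu}$ but its contractions
$R_{\mu\nu} = R^\rho{}_{\mu\rho\nu} = g^{\rho\sigma}R_{\rho\mu\sigma\nu}$; (8.48)
$R = g^{\mu\nu}R_{\mu\nu}$, (8.49)


defining the Ricci tensor and Ricci scalar, respectively. For later use we therefore compute the
contractions of (8.38), which are slightly involved. First, (7.184) and (8.38) give


$\tilde R_{\alpha\beta} + \mathrm{Tr}(\tilde k)\tilde k_{\alpha\beta} - \tilde k^2_{\alpha\beta} - \tilde g_\alpha^\mu\tilde g_\beta^\nu R_{\mu\nu} = L^{-1}(\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + \tilde\nabla_\alpha\tilde\nabla_\beta L) + \tilde k^2_{\alpha\beta}$, (8.50)
from which we obtain
$\tilde g_\alpha^\mu\tilde g_\beta^\nu R_{\mu\nu} = -L^{-1}(\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + \tilde\nabla_\alpha\tilde\nabla_\beta L) + \tilde R_{\alpha\beta} + \mathrm{Tr}(\tilde k)\tilde k_{\alpha\beta} - 2\tilde k^2_{\alpha\beta}$. (8.51)
Contracting both sides with $\tilde g^{\alpha\beta}$, and defining the 3d covariant Laplacian
$\tilde\Delta := \tilde g^{\alpha\beta}\tilde\nabla_\alpha\tilde\nabla_\beta$, (8.52)
gives
$R + N^\mu N^\nu R_{\mu\nu} = -L^{-1}(\tilde g^{\alpha\beta}\mathcal{L}_{e_0}\tilde k_{\alpha\beta} + \tilde\Delta L) + \tilde R + \mathrm{Tr}(\tilde k)^2 - 2\mathrm{Tr}(\tilde k^2)$. (8.53)
Since
$\mathcal{L}_{e_0}\tilde g_{\alpha\beta} = -2L\tilde k_{\alpha\beta}$ (8.54)
by (8.21), we have
$\mathcal{L}_{e_0}\tilde g^{\alpha\beta} = 2L\tilde k^{\alpha\beta}$, (8.55)


cf. (7.29), and hence
$\tilde g^{\alpha\beta}\mathcal{L}_{e_0}\tilde k_{\alpha\beta} = \mathcal{L}_{e_0}\mathrm{Tr}(\tilde k) - 2L\,\mathrm{Tr}(\tilde k^2)$, (8.56)
where of course $\mathcal{L}_{e_0}\mathrm{Tr}(\tilde k) = e_0(\mathrm{Tr}(\tilde k))$. Hence (8.53) may be rewritten as


$R + N^\mu N^\nu R_{\mu\nu} = -L^{-1}(\mathcal{L}_{e_0}\mathrm{Tr}(\tilde k) + \tilde\Delta L) + \tilde R + \mathrm{Tr}(\tilde k)^2$. (8.57)


Using (7.185), we finally obtain the twice contracted version of (8.38), namely


$R = \tilde R - 2L^{-1}(\mathcal{L}_{e_0}\mathrm{Tr}(\tilde k) + \tilde\Delta L) + \mathrm{Tr}(\tilde k)^2 + \mathrm{Tr}(\tilde k^2)$. (8.58)

8.3 The 3+1 decomposition of the Einstein equations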


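We now have all information for projecting the Einstein equations (7.1), with $T_{\mu\nu}$ decomposed
according to (7.61), in three different directions, namely, contracting with:³⁵²


• The spatial projection $\tilde g_\alpha^\mu\tilde g_\beta^\nu$, which gives the dynamical equations
$\mathcal{L}_{e_0}\tilde k_{\mu\nu} = -\tilde\nabla_\mu\tilde\nabla_\nu L + L(\tilde R_{\mu\nu} + \mathrm{Tr}(\tilde k)\tilde k_{\mu\nu} - 2\tilde k^2_{\mu\nu} + 4\pi((S-E)\tilde g_{\mu\nu} - 2S_{\mu\nu}))$; (8.59)
$\mathcal{L}_{e_0}\tilde g_{\mu\nu} = -2L\tilde k_{\mu\nu}$. (8.60)


These follow from (7.62), (8.51), (7.63), and (8.21). As already noted, in $\Sigma$-adapted
coordinates eq. (8.60) becomes (8.22), and with (8.59), one may write this system as


$(\partial_t - \mathcal{L}_S)\tilde k_{ij} = -\tilde\nabla_i\tilde\nabla_j L + L(\tilde R_{ij} + \mathrm{Tr}(\tilde k)\tilde k_{ij} - 2\tilde k^2_{ij} + 4\pi((S-E)\tilde g_{ij} - 2S_{ij}))$; (8.61)


$(\partial_t - \mathcal{L}_S)\tilde g_{ij} = -2L\tilde k_{ij}$, (8.62)
where, using (2.94) and (3.73), respectively, the two Lie derivatives may be written as

$\mathcal{L}_S\tilde k_{ij} = S^l\partial_l\tilde k_{ij} + \tilde k_{il}\partial_jS^l + \tilde k_{lj}\partial_iS^l$; (8.63)

$\mathcal{L}_S\tilde g_{ij} = \tilde\nabla_iS_j + \tilde\nabla_jS_i$. (8.64)


• The timelike projection $N^\mu N^\nu$, which gives the Hamiltonian constraint
$\tilde R - \mathrm{Tr}(\tilde k^2) + \mathrm{Tr}(\tilde k)^2 = 16\pi E$, (8.65)
which follows from (7.1) and (7.185). It plays a key role in (canonical) quantum gravity.
• The mixed projections $N^\mu\tilde g_\alpha^\nu$ or $\tilde g_\alpha^\mu N^\nu$, producing the momentum constraint
$\tilde\nabla_\mu\tilde k^\mu_\nu - \tilde\nabla_\nu\mathrm{Tr}(\tilde k) = 8\pi P_\nu$. (8.66)
This follows from (7.1), whose $g_{\mu\nu}R$ term contracts to zero, and (7.190). Equivalently,
$\tilde\nabla_j\tilde k^j_i - \tilde\nabla_i\mathrm{Tr}(\tilde k) = 8\pi P_i$. (8.67)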


Altogether, in adapted coordinates, eqs. (8.61), (8.62), (8.65), and (8.67) form a coupled system 
of 16 PDEs for 16 unknown functions $(\tilde g_{ij}, \tilde k_{ij}, L, S^i)$ defined on the Cauchy (hyper)surface $\Sigma$,
where the $\tilde k_{ij}$ may be exchanged for the time-derivatives $\partial_t\tilde g_{ij}$ through (8.62), leaving 10 coupled
PDEs for 10 unknowns $(\tilde g_{ij}, L, S^i)$, similar to the original covariant Einstein equations (which are
10 coupled PDEs for the 10 components $g_{\mu\nu}$ of the four-dimensional metric). In the latter case,
the spatial part consists of six evolution equations, whereas the other two parts contain only first
time derivatives of the spatial metric and no time derivatives of the lapse and shift functions at
all; hence these act as four constraints on the initial data $(\tilde g_{ij}, \partial_t\tilde g_{ij})$, or, in general, on $(\tilde g_{ij}, \tilde k_{ij})$.
Also cf. §7.5. The lapse and shift functions are not determined by the equations at all and hence
can be (more or less) freely chosen; doing so amounts to fixing a (local) gauge, see §8.1. In that
respect, the diffeomorphism invariance of the original (covariant) Einstein equations (7.1) has 
been traded for the arbitrariness of the lapse L and the shift S and hence of the foliation. 


The precise way these equations are equivalent to the Einstein equations is as follows: 


352 The letters $S$ and $S_{\mu\nu}$ on the right-hand sides below refer to the energy-momentum tensor, whereas the $S$ in $\mathcal{L}_S$
on the left and the $S^i$ on the right refer to the shift vector, sorry!
353 See Fischer & Marsden (1979), Theorem 4.1.

Theorem 8.2 Let $(M,g)$ be a globally hyperbolic space-time equipped with a foliation (8.1)
by spacelike Cauchy surfaces $\Sigma_t$, and associated lapse $L$ and shift $S$. Let $(\tilde g(t), \tilde k(t))$ be the
(Riemannian) 3-metric and extrinsic curvature on $\Sigma_t$ induced by the (Lorentzian) 4-metric $g$.

Then $g$ is a solution of the Einstein equations (7.1), possibly coupled to matter with conserved
energy-momentum tensor $T_{\mu\nu}$ in the sense that $\nabla^\mu T_{\mu\nu} = 0$ holds identically without (7.1),³⁵⁴ iff:


1. For some $t$ the pair $(\tilde g_{ij}(t), \tilde k_{ij}(t))$ satisfies the constraint equations (8.65) and (8.67);
2. The maps $t \mapsto \tilde g_{ij}(t)$ and $t \mapsto \tilde k_{ij}(t)$ satisfy the evolution equations (8.61) - (8.62).


This follows from our computations showing that (8.61), (8.62), (8.65), and (8.67) are equivalent 
to the Einstein equations (7.1). Furthermore, the proof in §7.5 that the constraints propagate is 
the same as for the vacuum case, see (7.126) - (7.127) and surrounding text. 


Conversely, for given lapse L > 0 and shift S one can only expect existence and uniqueness 
of an ensuing space-time (M,g) solving the vacuum Einstein equations locally in time, i.e. in 
some neighbourhood of $\Sigma$, since there is no a priori global control over the foliation that $(L, S)$ give rise to.

Theorem 8.2 understates the importance of the constraints for GR. In fact:³⁵⁵


Theorem 8.3 A (globally hyperbolic) space-time $(M, g)$ satisfies the Einstein equations $G_{\mu\nu} = 0$
iff the Hamiltonian constraint (7.148) holds on every spacelike (Cauchy) surface $\Sigma \subset M$.


Proof. The implication from left to right is obvious, so assume (7.148) holds on every spacelike
surface (we leave it to the reader to insert the words between brackets in the proof). As we have
seen via eqs. (7.189) and (7.185), the Hamiltonian constraint (7.148) is, pseudo-covariantly,

$N^\mu N^\nu G_{\mu\nu} = 0$ (8.68)

at each $x \in \Sigma$, where $N$ is the (fd) normal to $\Sigma$. Requiring this for all spacelike (Cauchy) surfaces
$\Sigma$ comes down to asking (8.68) for every timelike vector field $N$. If $N_1$ and $N_2$ are fd timelike,
then so is $N_1 + N_2$, which shows that (8.68) implies the seemingly stronger condition


$N_1^\mu N_2^\nu G_{\mu\nu} = 0$, (8.69)


for all timelike $N_1$ and $N_2$. Furthermore, any spacelike vector $X$ equals $X = N_1 - N_2$ for some
timelike $N_1$ and $N_2$, so that (8.68) implies $N^\mu X^\nu G_{\mu\nu} = 0$ for all timelike $N$ and spacelike $X$,
and by the same argument, $X_1^\mu X_2^\nu G_{\mu\nu} = 0$ for all spacelike $X_1$ and $X_2$. Finally, any vector $Y$ is
$Y = N + X$ for timelike $N$ and spacelike $X$, so that (8.68) implies $Y^\mu Z^\nu G_{\mu\nu} = 0$ for arbitrary
vectors $Y$ and $Z$. This is obviously equivalent to $G_{\mu\nu} = 0$.
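The step from (8.68) to (8.69) is a polarization argument. Writing $G(Y,Z) := Y^\mu Z^\nu G_{\mu\nu}$ (a shorthand introduced here only for this remark) and using the symmetry of $G_{\mu\nu}$, a sketch of the computation is:

```latex
\begin{aligned}
0 &= G(N_1 + N_2,\, N_1 + N_2) \\
  &= G(N_1, N_1) + 2\,G(N_1, N_2) + G(N_2, N_2) \\
  &= 2\,G(N_1, N_2),
\end{aligned}
```

where the first line is (8.68) applied to the fd timelike vector $N_1 + N_2$ and the last line uses (8.68) for $N_1$ and $N_2$ separately. The later steps of the proof then only use bilinearity of $G$.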


The simplest, perhaps somewhat trivial illustration of this formalism is Minkowski space $(\mathbb{M}, \eta)$,
foliated as $\mathbb{M} = \bigcup_{t \in \mathbb{R}}\Sigma_t$, where $\Sigma_t = \{(x^0, \vec x) \mid x^0 = t\}$, corresponding to the time function


$t(x^0, \vec x) = x^0$. (8.70)
In the usual coordinates one has $g = \eta$, so for this foliation the lapse and the shift are simply


L=1; S=0. (8.71) 


354 This is the case, for example, if $T_{\mu\nu}$ is obtained from a matter action $S_M$ via (7.77), where $S_M$ is obtained by
minimal coupling, in that in some special relativistic action, $\eta_{\mu\nu}$ and $\partial_\mu$ are replaced by $g_{\mu\nu}$ and $\nabla_\mu$, respectively.
See Anderson (1981) and Read, Brown, & Lehmkuhl (2018) for interesting perspectives on minimal coupling. 

355 The theorem, due to Moncrief & Teitelboim (1973), is valid with and without the words between brackets.

Furthermore, if we take $\Sigma = \Sigma_0$ as our Cauchy surface (which it clearly is), then the induced initial
data on $\Sigma$ are $\tilde g_{ij} = \delta_{ij}$ and, since the fd normal $N = (1,0,0,0)$ is independent of $(x^1,x^2,x^3)$ (and
even of $x^0$), we have $\tilde k_{ij} = 0$. Let us not fail to notice that these initial data satisfy the constraints
(7.148) - (7.149), i.e. (8.65) and (8.67) in vacuum ($E = 0$ and $P_i = 0$). From this, we recover the
(Minkowski) metric on any other $\Sigma_t$ by solving (8.61) - (8.62) with (8.71), that is,


$\partial_t\tilde k_{ij} = \tilde R_{ij} + \mathrm{Tr}(\tilde k)\tilde k_{ij} - 2\tilde k^2_{ij}$; (8.72)

$\partial_t\tilde g_{ij} = -2\tilde k_{ij}$, (8.73)

with initial conditions $\tilde g_{ij}(0) = \delta_{ij}$ and $\tilde k_{ij}(0) = 0$, and $\tilde R_{ij}(t)$ seen as function of $\tilde g_{ij}(t)$. The

unique solution is $\tilde g_{ij}(t) = \delta_{ij}$ and $\tilde k_{ij}(t) = 0$ for all $t \in \mathbb{R}$, upon which (8.14) gives $g_{\mu\nu} = \eta_{\mu\nu}$.
Now make this example nontrivial by considering the curious space-time $(I^+(0), \eta)$, i.e.

$M = I^+(0)$ (8.74)


is the interior of the forward lightcone emanating from the origin in Minkowski space-time, with
(relative) Minkowski metric. For ease of visualization we take $d = 2+1$, and set


$x^0 = t\cosh(\rho)$; $x^1 = t\sinh(\rho)\cos(\varphi)$; $x^2 = t\sinh(\rho)\sin(\varphi)$, (8.75)
where $t > 0$, $\rho \in \mathbb{R}$, and $\varphi \in [0,2\pi)$. Then define $\Sigma_t \subset I^+(0)$ as the hyperboloid (4.88), i.e.
$\Sigma_t = H^2_t = \{(x^0,x^1,x^2) \in \mathbb{R}^3 \mid -(x^0)^2 + (x^1)^2 + (x^2)^2 = -t^2,\ x^0 > 0\}$, (8.76)
so that $M = \bigcup_{t>0}\Sigma_t$. If we take $\Sigma = \Sigma_1$ to be our Cauchy surface in $I^+(0)$, with initial data


$\tilde g = d\rho^2 + \sinh^2(\rho)\,d\varphi^2$; (8.77)
$\tilde k = -\tilde g = -d\rho^2 - \sinh^2(\rho)\,d\varphi^2$, (8.78)


then $(\tilde g, \tilde k)$ satisfy the vacuum constraints (7.148) - (7.149). To check this, one may use (4.85)
with $n = 2$ and $k = -1$, so that $\tilde R_{ij} = -\tilde g_{ij}$ and $\tilde R = -2$. Secretly the initial data (8.77) - (8.78)
were obtained from the Minkowski metric $\eta$ expressed in the coordinates $(t,\rho,\varphi)$, which is


$\eta = -dt^2 + t^2(d\rho^2 + \sinh^2(\rho)\,d\varphi^2)$. (8.79)
This trivially reproduces $\tilde g$ in (8.77), and also leads to $\tilde k$ via the fact that the normal of $\Sigma_t$ is
$N = (\cosh(\rho), \sinh(\rho)\cos(\varphi), \sinh(\rho)\sin(\varphi))$, (8.80)


which happens to be independent of $t$. To recover the Minkowski metric from the initial data
$(\Sigma_1, \tilde g, \tilde k)$, we once again choose (8.71), as will be justified a posteriori, and then solve (8.72) -
(8.73) subject to these initial data. A nontrivial computation shows that the solution is given by


$\tilde g(t) = t^2(d\rho^2 + \sinh^2(\rho)\,d\varphi^2)$; (8.81)
$\tilde k(t) = -t^{-1}\tilde g(t) = -t\,(d\rho^2 + \sinh^2(\rho)\,d\varphi^2)$. (8.82)
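The claim that (8.81) - (8.82) solve the evolution equations (8.72) - (8.73) and the vacuum Hamiltonian constraint can be spot-checked numerically. The sketch below is an illustration only: it evaluates the residuals at one sample point, takes time derivatives by central differences, and uses the fact (an input here, following the text's use of (4.85)) that the Ricci tensor of $t^2$ times the unit hyperbolic metric equals minus the unit hyperbolic metric. All function names are hypothetical.

```python
import math

def unit_hyperbolic(rho):
    """Components of d rho^2 + sinh^2(rho) d phi^2 (diagonal 2x2 matrix)."""
    return [[1.0, 0.0], [0.0, math.sinh(rho) ** 2]]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def g_of_t(t, rho):
    """Spatial metric (8.81)."""
    return scale(t ** 2, unit_hyperbolic(rho))

def k_of_t(t, rho):
    """Extrinsic curvature (8.82)."""
    return scale(-t, unit_hyperbolic(rho))

def matmul(a, b):
    return [[sum(a[i][l] * b[l][j] for l in range(2)) for j in range(2)]
            for i in range(2)]

def time_derivative(f, t, rho, h=1e-5):
    """Central finite difference of a matrix-valued function of t."""
    plus, minus = f(t + h, rho), f(t - h, rho)
    return [[(plus[i][j] - minus[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def residuals(t, rho):
    g, k = g_of_t(t, rho), k_of_t(t, rho)
    g_inv = [[1.0 / g[0][0], 0.0], [0.0, 1.0 / g[1][1]]]   # g is diagonal
    ricci = scale(-1.0, unit_hyperbolic(rho))               # R_ij, assumed input
    k_mixed = matmul(g_inv, k)                              # k^i_j
    k_sq = matmul(k, k_mixed)                               # (k^2)_ij
    tr_k = k_mixed[0][0] + k_mixed[1][1]
    tr_k_sq = sum(k_mixed[i][j] * k_mixed[j][i] for i in range(2) for j in range(2))
    scalar_r = sum(g_inv[i][i] * ricci[i][i] for i in range(2))
    dg = time_derivative(g_of_t, t, rho)
    dk = time_derivative(k_of_t, t, rho)
    idx = [(i, j) for i in range(2) for j in range(2)]
    res_g = max(abs(dg[i][j] + 2 * k[i][j]) for i, j in idx)            # (8.73)
    res_k = max(abs(dk[i][j] - (ricci[i][j] + tr_k * k[i][j] - 2 * k_sq[i][j]))
                for i, j in idx)                                        # (8.72)
    res_h = abs(scalar_r - tr_k_sq + tr_k ** 2)                         # (8.65), E = 0
    return res_g, res_k, res_h

print(residuals(1.7, 0.8))
```

All three residuals should vanish up to finite-difference and rounding error; changing the sample point $(t, \rho)$ should not alter this.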


Once again using (8.14) with (8.71), this duly recovers the space-time metric (8.79).

The cosmological FLRW solution provides another illustration of the 3+1 formalism. Short
of giving the whole story, our starting point is that homogeneity and isotropy imply that the
3d Riemannian manifold $(\Sigma, \tilde g)$ carrying the initial data is one of the three spaces $(\Sigma_C, \tilde g_C)$ of
constant curvature studied in §4.4. These spaces are parametrized by $C \in \{-1,0,1\}$, i.e.³⁵⁶


$\Sigma_{-1} = H^3$; $\Sigma_0 = \mathbb{R}^3$; $\Sigma_1 = S^3$. (8.83)
The associated 4-metric is given by
$g = -dt^2 + a(t)^2\tilde g_C$, (8.84)


where the scale factor $t \mapsto a(t)$, initially defined on $\mathbb{R}^+$, is to be determined on the basis of the
Einstein equations and the specification of some energy-momentum tensor $T_{\mu\nu}$. The latter is
assumed to take the perfect fluid form (7.73), where $u^\mu = (1,0,0,0)$, so that


$T_{\mu\nu} = \mathrm{diag}(\varepsilon, p, p, p)$. (8.85)


Accordingly $E = \varepsilon$, $P_i = 0$, and $S_{ij} = p\,\tilde g_{ij}$; cf. (7.75). Since $L = 1$ and $S = 0$, it follows from
(8.23) and (8.84), where


$\tilde g = a^2\tilde g_C$, (8.86)
that
$\tilde k_{ij} = -(\dot a/a)\tilde g_{ij}$ (8.87)
and hence
$\mathrm{Tr}(\tilde k) = -3\dot a/a$. (8.88)
Here $a$ depends only on $t$ (i.e. it is constant on $\Sigma_C$), so that
$\tilde\nabla_i\tilde k_{ij} = -(\dot a/a)\tilde\nabla_i\tilde g_{ij} = 0$, (8.89)
as well as
$\tilde\nabla_i\mathrm{Tr}(\tilde k) = \partial_i\mathrm{Tr}(\tilde k) = 0$. (8.90)


Since $P_i = 0$, the momentum constraint (8.67) reads $0 = 0$ and hence is satisfied. Noting that
$\tilde R = 6C/a^2$ (8.91)


from (4.85), the Hamiltonian constraint (7.148) becomes


$\left(\frac{\dot a}{a}\right)^2 + \frac{C}{a^2} = \frac{8\pi}{3}\varepsilon$. (8.92)


Since eq. (8.60) has been incorporated, what remains is (8.59). After some reshuffling, including
removing the $\tilde R$ term using (8.92), contracting with $\tilde g^{ij}$ gives the second Friedman equation


$\frac{\ddot a}{a} = -\frac{4\pi}{3}(\varepsilon + 3p)$. (8.93)
Textbooks show how to solve (8.92) - (8.93), supplemented with an equation of state (such as


$p = 0$, describing dust, or $p = \varepsilon/3$ for photons). One needs considerable philosophical skill and
courage to deny that the ensuing expansion of the universe is a real process in time! See §8.11.
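As an illustration of (8.92) - (8.93), one can check numerically that the standard flat dust solution, $a(t) = t^{2/3}$ with $\varepsilon(t) = 1/(6\pi t^2)$, $p = 0$ and $C = 0$ (in units $G = c = 1$), satisfies both Friedman equations. This particular solution is a textbook example brought in here for the check, not something derived in the text; derivatives are taken by finite differences and the function names are hypothetical.

```python
import math

def scale_factor(t):
    """Flat dust (Einstein-de Sitter) scale factor, normalised so that a(1) = 1."""
    return t ** (2.0 / 3.0)

def energy_density(t):
    """Matching dust energy density in units G = c = 1."""
    return 1.0 / (6.0 * math.pi * t ** 2)

def friedman_residuals(t, curvature=0.0, pressure=0.0, h=1e-4):
    """Left minus right sides of (8.92) and (8.93) at time t."""
    a = scale_factor(t)
    a_dot = (scale_factor(t + h) - scale_factor(t - h)) / (2.0 * h)
    a_ddot = (scale_factor(t + h) - 2.0 * a + scale_factor(t - h)) / h ** 2
    eps = energy_density(t)
    first = (a_dot / a) ** 2 + curvature / a ** 2 - 8.0 * math.pi * eps / 3.0
    second = a_ddot / a + 4.0 * math.pi * (eps + 3.0 * pressure) / 3.0
    return first, second

for t in (0.5, 1.0, 2.0, 5.0):
    print(t, friedman_residuals(t))
```

Both residuals should be zero up to discretisation error. Passing a nonzero `curvature` or `pressure` with the same $a(t)$ should produce visibly nonzero residuals, showing that the check is not vacuous.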


356 To avoid confusion with the extrinsic curvature $\tilde k_{ij}$ we now write $C$ for the constant (curvature) $k$ in §4.4.

8.4 Static, stationary, and asymptotically flat space-times


From an expanding universe, we now move to the opposite end of the dynamical spectrum. 
A space-time (M,g) might naively be called stationary if there is a globally defined complete 
timelike Killing vector field $X$ for $g$, and static if in addition, $M$ admits a foliation à la (8.1) for
which $X$ is orthogonal to each leaf $\Sigma_t$. By the Frobenius theorem this is the case iff


$dX^\flat \wedge X^\flat = 0 \;\Leftrightarrow\; X_{[\mu}\nabla_\nu X_{\rho]} = 0$, (8.94)


since this expresses the property that the distribution of all vector fields orthogonal to $X$ is
integrable. This has the following consequences for the metric.³⁵⁷ First, $g$ is stationary with
respect to $X$ iff at least away from the zeros of $X$ (if any), in coordinates where $X = \partial_t$,


$g = -L^2(dt + \theta)^2 + \tilde g$; $\theta = \theta_i\,dx^i$; $\tilde g = \tilde g_{ij}\,dx^i dx^j$, (8.95)
where $L$, $\theta_i$, and $\tilde g_{ij}$ are independent of $t$. The static case then has $\theta = 0$, i.e.
$g = -L^2dt^2 + \tilde g$, (8.96)


where coordinates are such that $x = (t, \vec x)$ lies in $\Sigma_t$ as in (8.1), and $\vec x$ are coordinates on $\Sigma_t \cong \Sigma$.

Thus a metric is static iff it is stationary and invariant under time inversion $t \mapsto -t$. The
exterior Schwarzschild solution (9.15) for $r > 2m$ is static, whereas the Kerr metric is stationary;
time inversion makes the black hole rotate the other way round, both with $X = \partial_t$. But if we
extend the Schwarzschild solution to $0 < r < 2m$, as explained in §9.2, then $\partial_t$ becomes lightlike
at $r = 2m$ and even spacelike when $0 < r < 2m$, so that the definition of a stationary space-time
has to be relaxed if we wish to cover such cases. This is done as follows (see part 4):³⁵⁸


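Definition 8.4 1. A 3d Riemannian manifold $(\Sigma, \tilde g)$ is called asymptotically flat if:


(i) There is a bounded set $K \subset \Sigma$ whose complement $\Sigma \setminus K$ is a finite union of ends $\Sigma^{ext}_\alpha$,
each of which is diffeomorphic to $\mathbb{R}^3 \setminus B^3_1$ (where $B^3_1 = \{\vec x \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 < 1\}$).


(ii) For each $\alpha = 1,\ldots,\ell$ there exists a coordinate chart $\varphi_\alpha: \Sigma^{ext}_\alpha \to \mathbb{R}^3 \setminus B^3_1$ in which the
3-metric $\tilde g^\alpha = \tilde g|_{\Sigma^{ext}_\alpha}$ is asymptotically Euclidean in the sense that, pointwise as $|\vec x| \to \infty$,


$|\tilde g^\alpha_{ij}(\vec x) - \delta_{ij}| + |\vec x|\,|\partial_k\tilde g^\alpha_{ij}(\vec x)| + |\vec x|^2\,|\partial_k\partial_l\tilde g^\alpha_{ij}(\vec x)| = O(|\vec x|^{-1})$. (8.97)


(iii) The Ricci scalar $\tilde R$ of $\tilde g$ is integrable, i.e. $\int_\Sigma d^3x\,\sqrt{\det(\tilde g)(x)}\,|\tilde R(x)| < \infty$.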


357Here we follow Chrusciel (2020), §4.3.1 and then §4.3.7, where omitted details are simple exercises. 

358Such definitions go back at least to Lichnerowicz (1955) and have been made increasingly precise afterwards. 
For part 1 see Lee (2019), Definition 3.5, in which we take the simplest decay conditions. One may generalize
$O(|\vec x|^{-1})$ in (8.97) to $O(|\vec x|^{-p})$ for some $p \in (\tfrac12, 1]$, in which case (8.98) generalizes $O(|\vec x|^{-2})$ to $O(|\vec x|^{-p-1})$; one
needs $p > 1/2$ for the asymptotic mass $\mathfrak{m}$ in (8.103) to exist, and $p \leq 1$ for it to be potentially nonzero. In the
presence of matter one furthermore requires $|E(\vec x)| + |P_i(\vec x)| = O(|\vec x|^{-3})$, cf. (8.65) and (8.67), and Cederbaum &
Sakovich (2018). But if the constraints (8.65) - (8.67) and the dominant energy condition hold, this may be replaced
by our condition (iii), taken from Schoen (2009), Lecture 9, which condition is very convenient in practice. A 
completely different way of defining asymptotic flatness, going back to Penrose, will be discussed in §10.3.

2. An initial data set $(\Sigma, \tilde g, \tilde k)$ is asymptotically flat if $(\Sigma, \tilde g)$ is, and $\tilde k^\alpha = \tilde k|_{\Sigma^{ext}_\alpha}$ satisfies
$|\tilde k^\alpha_{ij}(\vec x)| + |\vec x|\,|\partial_l\tilde k^\alpha_{ij}(\vec x)| = O(|\vec x|^{-2})$. (8.98)


3. A space-time $(M, g)$ is asymptotically flat if it has a spacelike hypersurface $\iota: \Sigma \hookrightarrow M$
for which the induced data set $(\Sigma, \tilde g, \tilde k)$ given by the induced 3-metric $\tilde g = \iota^*g$ and the
second fundamental form $\tilde k$ of the embedding $\iota$ is asymptotically flat (as in items 1 and 2).


4. An asymptotically flat space-time is stationary if it has a complete Killing vector field $X$
that at each end $\Sigma^{ext}_\alpha$ is timelike, and $L - 1$ and $\theta_i$ in (8.95) are $O(|\vec x|^{-1})$ as in (8.97). It is
static if $X$ in addition satisfies the integrability condition (8.94), so that $\theta = 0$.


The idea is that the Killing vector field X defining stationarity need only be timelike “far away”. 
In asymptotically flat stationary space-times, the flow $\varphi_t$ of $X$ (assumed complete) consists of
isometries and since far-away observers with four-velocity $u = X/\sqrt{-g(X,X)}$ (who consider
themselves at rest) move along the flow lines, they see no change. One does need the full
complexity of this definition, since already the maximally extended Schwarzschild space-time
(i.e. the Kruskal solution) has two ends. Noting that in coordinates where (8.96) holds the shift $S$
vanishes, it should be clear from (8.61) - (8.62) that the static case simply corresponds to


$\tilde k_{ij} = 0$. (8.99)


In that case at least in vacuo the momentum constraint (8.66) is identically satisfied, whereas the
dynamical Einstein equation (8.61) and the Hamiltonian constraint (8.65) simplify to


$\tilde\nabla_i\tilde\nabla_j L = \tilde R_{ij}L$; (8.100)
$\tilde R = 0$, (8.101)


respectively. Contracting (8.100) with $\tilde g^{ij}$ and using (8.101) gives $\Delta_{\tilde g}L = 0$, where $\Delta_{\tilde g} = \tilde g^{ij}\tilde\nabla_i\tilde\nabla_j$
is the 3d Laplacian determined by $\tilde g$. In the presence of (8.100), this is equivalent to (8.101), so
that the Einstein equations for a static space-time are also given by


$\tilde\nabla_i\tilde\nabla_j L = \tilde R_{ij}L$; $\Delta_{\tilde g}L = 0$. (8.102)


The oldest rigorous result in this context is Lichnerowicz’s theorem from 1939, which 
states that if (M,g) is static, asymptotically flat, and geodesically complete (a property the 
Schwarzschild space-time lacks), then $(\Sigma, \tilde g)$ is isometric to flat Euclidean space and $L = 1$, so
that $(M,g)$ is isometric to Minkowski space-time.³⁵⁹ This follows from the theory of the Laplace
equation and the boundary condition $L \to 1$ at spatial infinity, cf. (8.96) and (8.97).

More generally, any geodesically complete stationary space-time solving the vacuum Einstein
equations is isometric to $\mathbb{R} \times \Sigma$ with flat metric,³⁶⁰ so that the assumption of asymptotic flatness
in Lichnerowicz’s theorem is only needed to enforce $\Sigma = \mathbb{R}^3$. See also Theorem 10.25 in §10.9.

As will be justified below from physics,³⁶¹ the asymptotic (ADM) energy $\mathfrak{m}$ is defined by


$\mathfrak{m} := \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma^i\,(\partial_j\tilde g_{ij} - \partial_i\tilde g_{jj})$, (8.103)


where $d^2\sigma^i = x^i\,r\sin\theta\,d\theta\,d\varphi$, with $\vec x = (r\sin\theta\cos\varphi, r\sin\theta\sin\varphi, r\cos\theta)$ as usual.


359 Choquet-Bruhat (2018) reports that this result even impressed Einstein, who had been unable to prove it.
360See Anderson (2000a), Chruściel, Lopes Costa, & Heusler (2012), and Cortier & Minerbe (2016). 
361 There are many other concepts of “mass” in GR, reviewed by Galloway, Miao, & Schoen (2015) and Lee (2019). 
The most famous “elliptic PDE” result in mathematical GR is the positive mass theorem:^{362}


Theorem 8.5 Let $(\Sigma,\tilde{g})$ be (geodesically) complete and asymptotically flat, with $\tilde{R} \geq 0$.
Then $\Pi^0 \geq 0$, with equality $\Pi^0 = 0$ iff $(\Sigma,\tilde{g})$ is isometric to Euclidean space $(\mathbb{R}^3,\delta)$.


The assumption $\tilde{R} \geq 0$ may be motivated by noting that in static space-times one has

$$\tilde{R} = 16\pi E, \tag{8.104}$$

which is the Hamiltonian constraint (8.65) with $\tilde{k} = 0$. Compare with the Newtonian formula

$$\Delta V = 4\pi\rho \tag{8.105}$$

for the gravitational potential, with the difference that (8.105) determines $V$, whereas (8.104)
merely constrains the metric $\tilde{g}_{ij}$. In any case, $\tilde{R} \geq 0$ now simply comes down to $E \geq 0$. In this
light, one may also justify (8.103) by noting that for asymptotically flat spaces one has

$$\tilde{R} = \partial_i(\partial_j \tilde{g}_{ij} - \partial_i \tilde{g}_{jj}) + O(|x|^{-4}), \tag{8.106}$$

where the first term comes from the first two terms in (7.26), in $d = 3$ of course, and the
last comes from the $\Gamma\cdot\Gamma$ terms, where $\Gamma$ contains first derivatives of $\tilde{g}$ and hence is $O(|x|^{-2})$.
Assuming for the moment that (8.106) holds globally and that $\Sigma = \mathbb{R}^3$, the total energy $\Pi^0$ of
all matter plus the gravitational field may then be defined as

$$\Pi^0 := \lim_{r\to\infty}\int_{B^3_r} d^3x\, E = \frac{1}{16\pi}\lim_{r\to\infty}\int_{B^3_r} d^3x\, \tilde{R} = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma^i\,(\partial_j\tilde{g}_{ij} - \partial_i\tilde{g}_{jj}), \tag{8.107}$$

which recovers (8.103).^{363} For example, for the spatial part of the Schwarzschild metric, i.e.

$$\tilde{g} = \left(1 - \frac{2m}{r}\right)^{-1} dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2), \tag{8.108}$$

one obtains $\Pi^0 = m$.^{364} Similarly, the asymptotic (ADM) momentum is defined by

$$\Pi_i := \frac{1}{8\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma^j\, \tilde{\pi}_{ij}, \tag{8.109}$$

where the canonical momentum $\tilde{\pi}_{ij}$ is defined in terms of $\tilde{g}_{ij}$ and $\tilde{k}_{ij}$ by (8.209) below in §8.7.
This leads to a generalization of Theorem 8.5. Let asymptotically flat initial data $(\Sigma,\tilde{g},\tilde{k})$ satisfy

$$\tfrac{1}{2}\left(\tilde{R} - \mathrm{Tr}(\tilde{k}^2) + \mathrm{Tr}(\tilde{k})^2\right) \geq \|\tilde{\nabla}_j\tilde{k}^j_{\ i} - \tilde{\nabla}_i \mathrm{Tr}(\tilde{k})\|_{\tilde{g}}. \tag{8.110}$$

Then $\Pi^0 \geq \|\Pi\|$. If the constraints (8.65) - (8.67) hold, eq. (8.110) is equivalent to $E \geq \|P\|$.


362 The original proof is due to Schoen & Yau (1979, 1981); see also Schoen (1989, 2009). For spin manifolds
Witten (1981) and Parker & Taubes (1982) proved the theorem in a completely different way. See also Lee (2019)
for both proofs. The Riemannian Penrose inequality sharpens the positive mass theorem; see §10.11.

363 Integrability of $\tilde{R}$ and existence of $\Pi^0$ are equivalent, and since the former is in Definition 8.4, the latter exists.

364 See Poisson (2004), §4.3.2, Gourgoulhon (2012), §8.3, Example 8.1, or Schoen (2009), Lecture 9. An efficient
way to do this computation, following the latter, is to write $\tilde{g}_{ij} = (1 + m/2|x|)^4\delta_{ij} + O(1/|x|^2)$. This gives the
integrand as $x^i(\partial_j\tilde{g}_{ij} - \partial_i\tilde{g}_{jj}) = 4m(1 + m/2|x|)^3/|x| + O(1/|x|^2)$. As $r \to \infty$ the error term does not contribute,
whilst the first gives $\int_{S^2_r} d^2\sigma^i\,(\partial_j\tilde{g}_{ij} - \partial_i\tilde{g}_{jj}) = 16\pi m(1 + m/2r)^3$, which as $r \to \infty$ yields $16\pi m$, and hence $\Pi^0 = m$.
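The computation in footnote 364 can also be reproduced numerically. The sketch below evaluates the surface integral in (8.103) on a sphere of finite radius $r$ for the conformally flat metric $(1 + m/2|x|)^4\delta_{ij}$, using central differences for the derivatives and a midpoint rule on the sphere. The function names, the step size and the grid sizes are choices made for this illustration only.

```python
import math


def schwarzschild_isotropic(m):
    """Return x -> g_ij(x) for g_ij = (1 + m/2|x|)^4 delta_ij."""
    def metric(x):
        r = math.sqrt(sum(c * c for c in x))
        psi4 = (1.0 + m / (2.0 * r)) ** 4
        return [[psi4 if i == j else 0.0 for j in range(3)] for i in range(3)]
    return metric


def partial(metric, x, k, h):
    """Central-difference approximation of d g_ij / d x^k at the point x."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(3)] for i in range(3)]


def adm_energy(metric, r, n_theta=40, n_phi=80):
    """(1/16 pi) * integral over the sphere |x| = r of n_i (d_j g_ij - d_i g_jj) r^2 dOmega."""
    h = 1e-4 * r
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for a in range(n_theta):
        theta = (a + 0.5) * d_theta
        for b in range(n_phi):
            phi = (b + 0.5) * d_phi
            n = [math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta)]
            x = [r * c for c in n]
            dg = [partial(metric, x, k, h) for k in range(3)]  # dg[k][i][j] = d_k g_ij
            integrand = sum(n[i] * (dg[j][i][j] - dg[i][j][j])
                            for i in range(3) for j in range(3))
            total += integrand * r * r * math.sin(theta) * d_theta * d_phi
    return total / (16.0 * math.pi)


m = 1.0
metric = schwarzschild_isotropic(m)
print(adm_energy(metric, 100.0), m * (1.0 + m / 200.0) ** 3)
```

At finite $r$ the result should be close to $m(1 + m/2r)^3$, up to the quadrature error of the midpoint rule, and it approaches $m$ as $r$ grows.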
The proof of Theorem 8.5 (which implies the generalization) is lengthy and difficult, but the
main steps are as follows. First, as explained in more detail in §8.6, one may apply a conformal
transformation $\hat{g} = \Omega^4\tilde{g}$, where the strictly positive function $\Omega \in C^\infty(\Sigma)$ solves the linear PDE

$$(\tilde{\Delta} - \tfrac{1}{8}\tilde{R})\,\Omega = 0. \tag{8.111}$$

Then $\hat{R} = 0$, where $\hat{R}$ is the Ricci scalar for $\hat{g}$. Since the flat space Laplace equation $\Delta f = 0$ in
$d = 3$ has fundamental solution $f = C/r$, we obtain $\Omega = 1 + C/r + O(1/r^2)$. The point is that
$C \leq 0$, as follows by integrating the equality $\tilde{\Delta}\Omega = \tfrac{1}{8}\tilde{R}\,\Omega$ over a three-ball $B^3_r$ and using $\tilde{R} \geq 0$.
Hence $\Pi^0(\hat{g}) = \Pi^0(\tilde{g}) + 2C \leq \Pi^0(\tilde{g})$, which reduces the proof to the case $\tilde{R} = 0$.

The proof then proceeds by contradiction. If $\Pi^0 < 0$, then one can find a smooth asymptotically
flat metric $\check{g}$ that equals $\tilde{g}$ in $B^3_\rho$ for some $\rho > 0$ and equals $\check{\Omega}^4\delta$ outside some ball $B$: this works
because, as before, $\check{\Omega} = 1 + C'/r + O(1/r^2)$ for some $C' < 0$, and we now have $2C' = \Pi^0$. This
metric can, in turn, be used to construct a new metric $g'$ that is exactly Euclidean outside some
three-ball and has $R' \geq 0$. This, however, contradicts the following remarkable lemma:^{365}


Lemma 8.6 If a Riemannian manifold with Ricci scalar $R \geq 0$ is isometric to Euclidean space
$\mathbb{R}^n \setminus B^n_\rho$ outside some compact set (for some $\rho > 0$), then it is isometric to $(\mathbb{R}^n,\delta)$.


This argument proves the first part of Theorem 8.5. A very elegant argument for the second
claim comes from Ricci Flow, the technique used to prove the Poincaré conjecture.^{366} Here
(especially in $d = 3$) a “time” dependent Riemannian metric $\tilde{g}(t)$ satisfies the parabolic PDE

$$\frac{\partial \tilde{g}_{ij}}{\partial t} = -2\tilde{R}_{ij}, \tag{8.112}$$

from some given initial metric $\tilde{g}(0)$. This induces a flow of the Ricci scalar $\tilde{R}$, namely

$$\partial_t\tilde{R} = \tilde{\Delta}\tilde{R} + 2\tilde{R}_{ij}\tilde{R}^{ij}. \tag{8.113}$$

It can then be shown that $\Pi^0$ is independent of $t$ (which is not surprising since it is an asymptotic
quantity), so if $\Pi^0 = 0$ for $\tilde{g}(0)$, then $\Pi^0 = 0$ for all $\tilde{g}(t)$. Step 1 above then shows that $\tilde{R}(t) = 0$
and hence (8.113) yields $\tilde{R}_{ij} = 0$. In $d = 3$ this means that the Riemann tensor also vanishes
(see §4.5) and hence by Theorem 4.1 our space is locally Euclidean. Asymptotic flatness then
prevents nontrivial topology for large $r$, whereas geodesic completeness forces the bounded set
$K \subset \Sigma$ in Definition 8.4.1 to be compact. Lemma 8.6 finally yields the claim.


Towards a further (covariant) justification of the definitions (8.103) and (8.109), in the physics
literature on linearized gravity, asymptotic flatness is expressed by the decomposition

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \tag{8.114}$$

where $\eta_{\mu\nu}$ is the Minkowski metric and $h_{\mu\nu}$ is “small”; this Ansatz seems predicated on the
topological assumption $M = \mathbb{R}^4$. Relating this to the assumptions in Definition 8.4 is highly
nontrivial,^{367} but since we only try to motivate (8.103) and (8.109) we omit the details.


365 See Corollary 2.32 in Lee (2019). This result, first conjectured by Geroch in 1975, is equivalent to the
nonexistence of positive scalar curvature metrics on a torus, which is Theorem 1.30 in Lee (2019).

366 The Poincaré conjecture states that any compact simply connected 3-manifold is diffeomorphic to the
three-sphere $S^3$. It was proved by the eccentric Russian mathematician Perelman in 2002-2003. See Morgan & Tian
(2007). The Ricci Flow approach to the positive mass theorem was developed by McFeron & Székelyhidi (2012).

367 See Christodoulou & Klainerman (1993) and Klainerman & Nicolò (2003a). See also Weinberg (1972), §7.6,
Misner, Thorne, & Wheeler (1973), §20.2, Jaramillo & Gourgoulhon (2009), and de Haro (2021).
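Assuming (8.114) and topological triviality of $M$ as above, we can expand the Einstein tensor
$G_{\mu\nu}$ to linear order in $h$, calling the result $G^L_{\mu\nu}$. In terms of the convenient expression

$$\bar{h}_{\mu\nu} := h_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}h^\rho_{\ \rho}, \tag{8.115}$$

where indices are raised and lowered through the Minkowski metric $\eta$, this gives

$$G^L_{\mu\nu} = \tfrac{1}{2}\left(-\Box\bar{h}_{\mu\nu} + \partial_\mu\partial^\rho\bar{h}_{\rho\nu} + \partial_\nu\partial^\rho\bar{h}_{\rho\mu} - \eta_{\mu\nu}\partial^\rho\partial^\sigma\bar{h}_{\rho\sigma}\right), \tag{8.116}$$

which leads to the linearized Einstein equations

$$G^L_{\mu\nu} \approx 8\pi T_{\mu\nu}. \tag{8.117}$$

For later use, we note that $G_L^{\mu\nu} := \eta^{\mu\rho}\eta^{\nu\sigma}G^L_{\rho\sigma}$ may be written as

$$G_L^{\mu\nu} = \tfrac{1}{2}\partial_\alpha\partial_\beta H^{\mu\alpha\nu\beta}; \tag{8.118}$$
$$H^{\mu\alpha\nu\beta} := -\bar{h}^{\mu\nu}\eta^{\alpha\beta} - \eta^{\mu\nu}\bar{h}^{\alpha\beta} + \bar{h}^{\alpha\nu}\eta^{\mu\beta} + \bar{h}^{\mu\beta}\eta^{\alpha\nu}, \tag{8.119}$$

where the so-called “superpotential” $H$ has (anti)symmetries

$$H^{\mu\alpha\nu\beta} = H^{\nu\beta\mu\alpha} = -H^{\alpha\mu\nu\beta} = -H^{\mu\alpha\beta\nu}. \tag{8.120}$$

One may also rewrite the full Einstein equations in exact form as

$$G^L_{\mu\nu} = 8\pi\tau_{\mu\nu} := 8\pi(T_{\mu\nu} + t_{\mu\nu}); \tag{8.121}$$
$$t_{\mu\nu} := (8\pi)^{-1}(G^L_{\mu\nu} - G_{\mu\nu}), \tag{8.122}$$

where $t_{\mu\nu}$, sometimes seen as the self-energy-momentum (pseudo) tensor of the gravitational
field, is quadratic in $h$. Eqs. (8.118) and (8.120) then immediately give the conservation laws

$$\partial_\nu G_L^{\mu\nu} = 0; \qquad \partial_\nu\tau^{\mu\nu} = 0, \tag{8.123}$$

where the first one is an identity and the second one relies on the field equation (8.121).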

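Assuming for the moment that (8.114) holds globally and that $\Sigma = \mathbb{R}^3$, the total
energy-momentum $\Pi^\mu$ of all matter plus the gravitational field may then be defined as

$$\Pi^\mu := \lim_{r\to\infty}\int_{B^3_r} d^3x\, \tau^{0\mu} = \frac{1}{8\pi}\lim_{r\to\infty}\int_{B^3_r} d^3x\, G_L^{0\mu}, \tag{8.124}$$

where we have used the exact equation (8.121). The same expression on the right-hand side appears
if we define $\Pi^\mu$ as the integral of $T^{0\mu}$ instead of $\tau^{0\mu}$, and then use the approximate (linearized)
equations (8.117). So either way, we may proceed using (8.118) and (8.120) to obtain

$$\Pi^\mu = \frac{1}{16\pi}\lim_{r\to\infty}\int_{B^3_r} d^3x\,\partial_\alpha\partial_\beta H^{0\alpha\mu\beta} = \frac{1}{16\pi}\lim_{r\to\infty}\int_{B^3_r} d^3x\,\partial_i\partial_\beta H^{0i\mu\beta} = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma_i\,\partial_\beta H^{0i\mu\beta}, \tag{8.125}$$

which is valid whatever is going on inside the compact region $K \subset \Sigma$ about which we have no
information, so that we may also take (8.125) as the definition of $\Pi^\mu$. In particular, for $\mu = 0$,

$$\Pi^0 = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma_i\,\partial_j H^{0i0j} = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma^i\,(\partial_j h_{ij} - \partial_i h_{jj})$$
$$\phantom{\Pi^0} = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r} d^2\sigma^i\,(\partial_j\tilde{g}_{ij} - \partial_i\tilde{g}_{jj}) = \frac{1}{16\pi}\lim_{r\to\infty}\int_{S^2_r}\left(\nabla_\delta\cdot\tilde{g} - d(\mathrm{Tr}_\delta(\tilde{g}))\right), \tag{8.126}$$

which is (8.103). The derivation of (8.109) from (8.124) is similar and is left to the reader.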
8.5 The origin of diffeomorphism invariance?


Having seen the linearized Einstein equations (8.117), it would be a pity not to mention an
argument for (at least infinitesimal) general covariance that sheds new light on this
property compared to the kind of arguments mentioned in §1.10. General relativists do not like
this argument, since it takes place in Minkowski space-time and puts special relativity before
general relativity. But special relativists (i.e. particle physicists) do, for the very same reason!^{368}

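The argument is based on the unitary irreducible representations of the Poincaré group $P$,
which were classified by Wigner (1939).^{369} We first define $P$ as the semidirect product

$$P := O(3,1) \ltimes \mathbb{R}^4; \tag{8.127}$$
$$O(3,1) := \{\Lambda \in GL_4(\mathbb{R}) \mid \langle\Lambda x,\Lambda y\rangle_M = \langle x,y\rangle_M\ \ \forall x,y\in\mathbb{R}^4\}, \tag{8.128}$$

where $\langle\cdot,\cdot\rangle_M$ is the Minkowski inner product in $\mathbb{R}^4$. One calls $O(3,1)$ the Lorentz group. This
means that $P$ consists of pairs $(\Lambda,a) \in O(3,1)\times\mathbb{R}^4$, equipped with group operations

$$(\Lambda,a)\cdot(\Lambda',a') = (\Lambda\Lambda', a + \Lambda\cdot a'); \tag{8.129}$$
$$(\Lambda,a)^{-1} = (\Lambda^{-1}, -\Lambda^{-1}a). \tag{8.130}$$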
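The group operations (8.129) - (8.130) are easy to test numerically. The following sketch represents an element of $P$ as a pair of a $4\times4$ matrix and a 4-vector, builds a sample Lorentz transformation from a boost and a rotation, and checks that composing an element with its inverse gives the unit element. The helper names and the sample values are hypothetical; the inverse of a Lorentz matrix is computed as $\eta\Lambda^T\eta$, which follows from the defining condition in (8.128).

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal of the Minkowski metric, signature (-,+,+,+)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(4)) for i in range(4)]


def minkowski(x, y):
    """Minkowski inner product <x, y>_M."""
    return sum(ETA[k] * x[k] * y[k] for k in range(4))


def boost_x(chi):
    """Boost along the first spatial axis with rapidity chi."""
    c, s = math.cosh(chi), math.sinh(chi)
    return [[c, s, 0.0, 0.0], [s, c, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def rot_z(theta):
    """Rotation about the third spatial axis by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0], [0.0, c, -s, 0.0], [0.0, s, c, 0.0], [0.0, 0.0, 0.0, 1.0]]


def lorentz_inverse(lam):
    """Inverse of a Lorentz matrix, computed as eta * lam^T * eta."""
    return [[ETA[i] * lam[j][i] * ETA[j] for j in range(4)] for i in range(4)]


def compose(g1, g2):
    """(L1, a1) . (L2, a2) = (L1 L2, a1 + L1 a2), cf. (8.129)."""
    (l1, a1), (l2, a2) = g1, g2
    return (matmul(l1, l2), [u + v for u, v in zip(a1, matvec(l1, a2))])


def inverse(g):
    """(L, a)^{-1} = (L^{-1}, -L^{-1} a), cf. (8.130)."""
    lam, a = g
    lam_inv = lorentz_inverse(lam)
    return (lam_inv, [-c for c in matvec(lam_inv, a)])


g = (matmul(boost_x(0.7), rot_z(0.3)), [1.0, 2.0, 3.0, 4.0])
unit_lam, unit_a = compose(g, inverse(g))
print(unit_lam, unit_a)  # identity matrix and zero vector, up to rounding
```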


The full Lorentz group $O(3,1)$ has four connected components, which may be identified by the
(independent) conditions $\det(\Lambda) = \pm1$ and $\pm\Lambda^0_{\ 0} > 0$. For the moment we restrict ourselves
to the connected component containing the identity (which is like $SO(3)$ in $O(3)$), in which
$\det(\Lambda) = 1$ and $\Lambda^0_{\ 0} > 0$. This group, which we call $L$, is the proper orthochronous Lorentz
group. It gives rise to $P_0 = L \ltimes \mathbb{R}^4$, which is the connected component of the identity in $P$. If we
write the $L$-action on $\mathbb{R}^4$ as $x^\mu \mapsto \Lambda^\mu_{\ \nu}x^\nu$, then $\Lambda \in GL(4,\mathbb{R})$ should satisfy

$$\eta_{\rho\sigma}\Lambda^\rho_{\ \mu}\Lambda^\sigma_{\ \nu} = \eta_{\mu\nu}. \tag{8.131}$$

Wigner showed that it is the dual action of $L$ on $(\mathbb{R}^4)^*$, seen as 4-momentum space, that counts
for the classification: if we denote elements of the dual vector space $(\mathbb{R}^4)^* \cong \mathbb{R}^4$ by $p_\mu$, then
the dual action is $p_\mu \mapsto \Lambda_\mu^{\ \nu}p_\nu$, where, as the notation indicates, $\Lambda_\mu^{\ \nu} = \eta_{\mu\rho}\Lambda^\rho_{\ \sigma}\eta^{\sigma\nu}$.


Writing $p^2 = -p_0^2 + p_1^2 + p_2^2 + p_3^2$, the $L$-orbits $\mathcal{O}$ in $(\mathbb{R}^4)^* \cong \mathbb{R}^4$ are easily seen to be:

1. $\mathcal{O}^0_0 = \{(0,0,0,0)\}$, with stabilizer $L_0 = L$;

2. $\mathcal{O}^\pm_m = \{p \in \mathbb{R}^4 \mid p^2 = -m^2, \pm p_0 > 0\}$, $m > 0$, with stabilizer $L_0 \cong SO(3)$;

3. $\mathcal{O}^\pm_0 = \{p \in \mathbb{R}^4 \mid p^2 = 0, \pm p_0 > 0\}$, with stabilizer $L_0 \cong E(2) = SO(2) \ltimes \mathbb{R}^2$;

4. $\mathcal{O}_{im} = \{p \in \mathbb{R}^4 \mid p^2 = m^2\}$, $m > 0$, with stabilizer $L_0 \cong SO(2,1)$.

Here the stabilizers $L_0$ are found by taking reference points $(\pm m,0,0,0)$ in case 2, $(\pm1,0,0,-1)$
in case 3, and $(0,0,0,m)$ in case 4. The physically relevant cases seem to be $\mathcal{O}^\pm_m$ and $\mathcal{O}^\pm_0$, since
$\mathcal{O}_{im}$ describes tachyons, which probably do not exist. The unitary irreducible representations
of $P_0$ are then labeled by pairs $(\mathcal{O},\chi)$, where $\mathcal{O} \subset (\mathbb{R}^4)^*$ is one of these orbits, and $\chi$ labels a
unitary irreducible representation of the corresponding stabilizer (which depends on $\mathcal{O}$).
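The orbit classification can be made concrete with a few lines of code. The sketch below sorts a 4-momentum into one of the four orbit types by the sign of $p^2$ and of $p_0$, and checks that a boost does not change the label. The labels, the tolerance and the function names are ad hoc choices for this illustration; for simplicity the boost is applied to the components as if they were those of a vector, which is harmless here since only the invariant $p^2$ and the sign of $p_0$ are used.

```python
import math


def minkowski_square(p):
    """p^2 = -p_0^2 + p_1^2 + p_2^2 + p_3^2."""
    return -p[0] ** 2 + p[1] ** 2 + p[2] ** 2 + p[3] ** 2


def classify_orbit(p, tol=1e-9):
    """Return (label, m) for the L-orbit through p, following cases 1-4 above."""
    if all(abs(c) <= tol for c in p):
        return ("origin", 0.0)
    p2 = minkowski_square(p)
    if abs(p2) <= tol:
        return ("massless+" if p[0] > 0 else "massless-", 0.0)
    if p2 < 0:
        return ("massive+" if p[0] > 0 else "massive-", math.sqrt(-p2))
    return ("tachyonic", math.sqrt(p2))


def boost_z(p, chi):
    """Boost along the third axis with rapidity chi."""
    c, s = math.cosh(chi), math.sinh(chi)
    return [c * p[0] + s * p[3], p[1], p[2], s * p[0] + c * p[3]]


for reference in ([2.0, 0.0, 0.0, 0.0], [1.0, 0.0, 0.0, -1.0], [0.0, 0.0, 0.0, 3.0]):
    print(classify_orbit(reference), classify_orbit(boost_z(reference, 1.3)))
```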


368 See Weinberg (1972), §10.2, Scharf (2016), and, reluctantly, Misner, Thorne, & Wheeler (1973), Box 17.2 (5).
369 See also Barut & Rączka (1977), chapter 17, and Landsman (1998), §IV.3.
In particular, for $\mathcal{O}^\pm_m$ with $m > 0$ one has $\chi = j \in \{0,1,2,\ldots\}$. This describes the spin of the
(elementary) particle described by the given representation,^{370} so that the total label is $(m, j)$.
For $m = 0$, on the other hand, $\chi$ labels unitary irreducible representations of the 2d Euclidean
group $E(2)$. This would in principle involve another continuous label, but it seems that in reality
only the case occurs where the elements of $\mathbb{R}^2$ are represented trivially, so that one just needs a
label for $SO(2)$, which is $\lambda \in \mathbb{Z}$, called helicity. Denoting elements of $E(2) = SO(2) \ltimes \mathbb{R}^2$ by
$(z,x,y)$, where $z \in SO(2) \cong \mathbb{T}$ and $(x,y) \in \mathbb{R}^2$, helicity is just the character

$$u_\lambda : E(2) \to \mathbb{T}; \qquad u_\lambda(z,x,y) = z^\lambda. \tag{8.132}$$

Including parity, i.e. $\mathrm{diag}(1,-1,-1,-1)$, in $L$ then forces $\lambda$ to be accompanied by $-\lambda$. The case
$\lambda = 0$ does not occur, but the pair $\lambda = \pm1$ describes photons whereas $\lambda = \pm2$ gives gravitons.
In this light, traditional (relativistic) quantum field theory may be understood as follows:
In this light, traditional (relativistic) quantum field theory may be understood as follows: 


1. Linear field equations distill specific unitary irreducible representations (typically realized
in momentum space) from covariantly transforming fields (defined in space-time);

2. Nonlinear terms (often dictated by other symmetries than Poincaré invariance) are added
to the equations to describe interactions between the elementary particles thus involved.^{371}

For example, the Klein-Gordon equation $(\Box - m^2)\varphi = 0$, where $\Box = -\partial_t^2 + \Delta$ and $\varphi : \mathbb{R}^4 \to \mathbb{R}$
is a real scalar field, selects the representation labeled by $(m, +, j = 0)$. Namely, we write

$$\varphi(x) = \int_{\mathbb{R}^3}\frac{d^3p}{p^0}\, e^{ipx}\hat{\varphi}(p), \tag{8.133}$$

where $px = p_\mu x^\mu$ with $p^0 = \sqrt{|\mathbf{p}|^2 + m^2}$, which solves the Klein-Gordon equation, and take

$$\hat{\varphi} \in H = L^2(\mathbb{R}^3, d^3p/p^0), \tag{8.134}$$

whose measure $d^3p/p^0$ is Lorentz-invariant. The natural space-time covariant $P$-action

$$(\Lambda,a)\cdot\varphi(x) = \varphi(\Lambda^{-1}(x-a)) \tag{8.135}$$

then corresponds to Wigner’s realization of the $(\mathcal{O}^+_m, 0)$ unitary irreducible representation, i.e.

$$U_{(m,+,0)}(\Lambda,a)\hat{\varphi}(p) = e^{-ipa}\hat{\varphi}(\Lambda^{-1}p). \tag{8.136}$$

For $m > 0$ this can be generalized to arbitrary spin $j > 0$, but our interest lies in the case $m = 0$,
where we again recall Wigner’s manifestly unitary (but unrecognizably space-time covariant)
formula for the representation $U_{(0,\pm,\lambda)}(P_0)$ labeled by the orbit $\mathcal{O}^\pm_0$ and the helicity $\lambda \in \mathbb{Z}$. For
any $\lambda$ this is realized on the same Hilbert space (8.134), but now (as in the case of higher spin)
Wigner’s formula for the explicit realization requires a (measurable) cross-section

$$b : \mathcal{O}^+_0 \to L \tag{8.137}$$


370 By another analysis of Wigner, one should also include representations of the universal covering group of $P$,
which allows $j$ to be half-integral, too. Likewise for helicity, but this complication is not needed in what follows.

371 The reason general relativists frown on this is precisely that in GR it is the full nonlinear (Einstein) field
equations that are more natural than the linear ones, and a similar point in fact applies to Yang-Mills theories.
of the canonical projection

$$\pi : L \to \mathcal{O}^+_0 \cong L/E(2), \tag{8.138}$$

i.e. $\pi \circ b = \mathrm{id}$, where without loss of generality we may and will assume that

$$b(p') = e, \tag{8.139}$$

the unit element of $L$. The ($L$-equivariant) diffeomorphism $\mathcal{O}^+_0 \cong L/E(2)$ is $p \mapsto [\Lambda]$, where
$\Lambda \in L$ satisfies $p = \Lambda p'$ with $p'_\mu = (1,0,0,1)$ and $[\Lambda]$ is its equivalence class (i.e. image under
$\pi$) in $L/E(2)$. Moreover, we have $\mathcal{O}^+_0 \cong \mathbb{R}^3$ by mapping $(p^0,\mathbf{p}) \in \mathcal{O}^+_0$ with $p^0 = |\mathbf{p}|$ to $\mathbf{p} \in \mathbb{R}^3$;
this diffeomorphism is also $L$-equivariant if we define $\Lambda\mathbf{p}$ as the spatial part of $\Lambda_\mu^{\ \nu}p_\nu$. One then
verifies that the Wigner cocycle $b(p)^{-1}\Lambda\, b(\Lambda^{-1}p)$ lies in $L_0 = E(2)$. Then for any $(\Lambda,a) \in P_0$,

$$U_{(0,+,\lambda)}(\Lambda,a)\psi(p) = e^{-ipa}\,u_\lambda\big(b(p)^{-1}\Lambda\, b(\Lambda^{-1}p)\big)\,\psi(\Lambda^{-1}p). \tag{8.140}$$


For $\lambda = \pm1$, we now relate this unitary yet mysterious expression to the manifestly space-time
covariant action of $P_0$ on the electromagnetic field potential $A_\mu$, on which we simply have

$$(\Lambda,a)\cdot A_\mu(x) = \Lambda_\mu^{\ \nu}A_\nu(\Lambda^{-1}(x-a)). \tag{8.141}$$

The idea is that we pass to a new space, consisting of solutions of the Maxwell equation

$$\Box A_\mu - \partial_\mu\partial^\nu A_\nu = 0, \tag{8.142}$$

cf. (7.87), modulo gauge transformations

$$A_\mu \mapsto A_\mu + \partial_\mu\lambda. \tag{8.143}$$

Both the equations and the quotienting are Poincaré invariant: $P$ maps the solution space of
(8.142) to itself, and if $A_\mu \sim A'_\mu$ in that $A_\mu = A'_\mu + \partial_\mu\lambda$, then $(\Lambda,a)\cdot A_\mu \sim (\Lambda,a)\cdot A'_\mu$ for any
$(\Lambda,a) \in P$. The second (quotienting) step may, in turn, be performed in two stages:

1. Find a representative $A_\mu$ of $A'_\mu$ in its equivalence class under (8.143) by imposing the
Lorenz gauge $\partial^\mu A_\mu = 0$, cf. (7.92). This can be done by solving $\lambda$ from $\Box\lambda = -\partial^\nu A'_\nu$.

2. Quotient by the residual gauge transformations within some class of solutions of the pair

$$\Box A_\mu = 0; \qquad \partial^\nu A_\nu = 0. \tag{8.144}$$

The $\lambda$ in the residual gauge transformations (8.143) following (8.144) should then satisfy

$$\Box\lambda = 0. \tag{8.145}$$

The first equation in (8.144) is solved by the spatial Fourier expansion

$$A_\mu(x) = \int_{\mathbb{R}^3}\frac{d^3p}{p^0}\, e^{ipx}\hat{A}_\mu(p), \tag{8.146}$$

where this time $p^0 = |\mathbf{p}|$ and each component $\hat{A}_\mu \in H$ as in (8.134). The second equation in
(8.144) comes down to $p^\mu\hat{A}_\mu(p) = 0$. Under Poincaré transformations, from (8.141) we have

$$((\Lambda,a)\cdot\hat{A})_\mu(p) = e^{-ipa}\Lambda_\mu^{\ \nu}\hat{A}_\nu(\Lambda^{-1}p). \tag{8.147}$$
To make this look more like Wigner’s unitary expression (8.140) we change variables to

$$\tilde{A}_\mu(p) = (b(p)^{-1})_\mu^{\ \nu}\hat{A}_\nu(p), \tag{8.148}$$

so that (8.147) becomes

$$(\Lambda,a)\cdot\tilde{A}_\mu(p) = e^{-ipa}\big(b(p)^{-1}\Lambda\, b(\Lambda^{-1}p)\big)_\mu^{\ \nu}\tilde{A}_\nu(\Lambda^{-1}p). \tag{8.149}$$

Taking $\mathbf{p} = (0,0,1) \equiv \mathbf{p}'$ and hence $(p^0,\mathbf{p}) = p'$, and $\Lambda \in E(2)$, eqs. (8.140) and (8.149) become

$$U_{(0,+,\lambda)}(\Lambda,0)\psi(p') = u_\lambda(\Lambda)\psi(p'); \tag{8.150}$$
$$(\Lambda,0)\cdot\hat{A}_\mu(p') = \Lambda_\mu^{\ \nu}\hat{A}_\nu(p'), \tag{8.151}$$

where we used (8.139), the (defining) property $\Lambda p' = p'$ for $\Lambda \in L_0 = E(2)$, eq. (8.147), and
$\tilde{A}_\mu(p') = \hat{A}_\mu(p')$, which follows from (8.148). Therefore, in order to compare the covariant
transformation (8.141) with the unitary representation (8.140) all we need to do is look at the
solutions $\hat{A}_\mu(p)$ of the two equations (8.144), quotiented by (8.143), at the special point $p = p'$.

At this point,^{372} the second equation in (8.144) and gauge transformation (8.143) become

$$\hat{A}_0(p') = \hat{A}_3(p'); \tag{8.152}$$
$$\hat{A}_0(p') \mapsto \hat{A}_0(p') + i\hat{\lambda}(p'); \qquad \hat{A}_3(p') \mapsto \hat{A}_3(p') + i\hat{\lambda}(p'); \tag{8.153}$$
$$\hat{A}_1(p') \mapsto \hat{A}_1(p'); \qquad \hat{A}_2(p') \mapsto \hat{A}_2(p'), \tag{8.154}$$

where $\hat{\lambda} \in H$, as in (8.134), defines the residual gauge function $\lambda$, required after all to satisfy
(8.145), by a formula analogous to (8.146). Since $\hat{A}_0$ can be eliminated in favour of $\hat{A}_3$ and the
latter is pure gauge, it is clear that, frozen at $p'$, the true unconstrained degrees of freedom after
solving (8.144) and quotienting by the (residual) gauge freedom are $\hat{A}_1$ and $\hat{A}_2$.

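The computation of the right-hand side of (8.151) and its comparison with (8.150) relies on
the precise embedding of $E(2)$ in $L$. To describe this, we use a specific basis of the Lie algebra $\mathfrak{l}$
of $L$, which consists of all real $4 \times 4$ matrices $M$ that exponentiate to $L$; this comes down to the
condition $M_{\mu\nu} = -M_{\nu\mu}$, where $M_{\mu\nu} = M_\mu^{\ \rho}\eta_{\rho\nu}$. We then take the following basis of $\mathfrak{l}$:^{373}

$$B_1 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&0\\ 0&0&0&0\\ 0&0&0&0 \end{pmatrix}, \quad
B_2 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&0\\ 1&0&0&0\\ 0&0&0&0 \end{pmatrix}, \quad
B_3 = \begin{pmatrix} 0&0&0&1\\ 0&0&0&0\\ 0&0&0&0\\ 1&0&0&0 \end{pmatrix}; \tag{8.155}$$

$$J_1 = \begin{pmatrix} 0&0&0&0\\ 0&0&0&0\\ 0&0&0&-1\\ 0&0&1&0 \end{pmatrix}, \quad
J_2 = \begin{pmatrix} 0&0&0&0\\ 0&0&0&1\\ 0&0&0&0\\ 0&-1&0&0 \end{pmatrix}, \quad
J_3 = \begin{pmatrix} 0&0&0&0\\ 0&0&-1&0\\ 0&1&0&0\\ 0&0&0&0 \end{pmatrix}, \tag{8.156}$$

which satisfy the commutation relations (i.e. Lie brackets) appropriate to the Lie algebra $\mathfrak{l}$, viz.

$$[B_i,B_j] = -\varepsilon_{ijk}J_k, \qquad [J_i,J_j] = \varepsilon_{ijk}J_k, \qquad [J_i,B_j] = \varepsilon_{ijk}B_k. \tag{8.157}$$


372 Of course, one should handle this carefully, since functions in $L^2$ do not have a value at any particular point.
The functional analysis of this situation (and the next) is correctly handled in Landsman & Wiedemann (1995).
373 Note that these matrices are the $M_\mu^{\ \nu} = \eta^{\nu\rho}M_{\mu\rho}$, not the $M_{\mu\nu}$ on which the condition $M_{\mu\nu} = -M_{\nu\mu}$ is imposed.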
Given the choice $p'_\mu = (1,0,0,1)$, the stabilizer $E(2)$ is generated by the elements

$$T_1 = B_1 - J_2 = \begin{pmatrix} 0&1&0&0\\ 1&0&0&-1\\ 0&0&0&0\\ 0&1&0&0 \end{pmatrix}, \qquad
T_2 = B_2 + J_1 = \begin{pmatrix} 0&0&1&0\\ 0&0&0&0\\ 1&0&0&-1\\ 0&0&1&0 \end{pmatrix}, \tag{8.158}$$

and $J_3$, each of which duly annihilates $(1,0,0,1)$. The commutation relations are

$$[T_1,T_2] = 0, \qquad [J_3,T_1] = T_2, \qquad [J_3,T_2] = -T_1, \tag{8.159}$$

as appropriate for $E(2)$. Here $(T_1,T_2)$ generate $\mathbb{R}^2$ and $J_3$ generates $SO(2)$ in $SO(2) \ltimes \mathbb{R}^2$.
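The commutation relations (8.157) and (8.159) can be checked mechanically. The sketch below enters the six basis matrices of (8.155) - (8.156) as integer arrays, forms $T_1$ and $T_2$ as in (8.158), and verifies the brackets as well as the fact that $T_1$, $T_2$ and $J_3$ annihilate $(1,0,0,1)$, treated as a column vector. The helper functions are written for this illustration only.

```python
B = {
    1: [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    2: [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]],
    3: [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0]],
}
J = {
    1: [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]],
    2: [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0], [0, -1, 0, 0]],
    3: [[0, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 0]],
}


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def combine(ca, a, cb, b):
    """Linear combination ca * a + cb * b of two 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return combine(1, matmul(a, b), -1, matmul(b, a))


def eps(i, j, k):
    """Levi-Civita symbol for indices in {1, 2, 3}."""
    return (i - j) * (j - k) * (k - i) // 2


def eps_sum(i, j, basis):
    """sum_k eps_ijk * basis[k]."""
    total = [[0] * 4 for _ in range(4)]
    for k in (1, 2, 3):
        total = combine(1, total, eps(i, j, k), basis[k])
    return total


ZERO = [[0] * 4 for _ in range(4)]
T1 = combine(1, B[1], -1, J[2])
T2 = combine(1, B[2], 1, J[1])

lorentz_ok = all(
    comm(B[i], B[j]) == combine(-1, eps_sum(i, j, J), 0, ZERO)
    and comm(J[i], J[j]) == eps_sum(i, j, J)
    and comm(J[i], B[j]) == eps_sum(i, j, B)
    for i in (1, 2, 3) for j in (1, 2, 3)
)
e2_ok = (comm(T1, T2) == ZERO and comm(J[3], T1) == T2
         and comm(J[3], T2) == combine(-1, T1, 0, ZERO))
print(lorentz_ok, e2_ok)
```

Since all entries are integers, the comparisons are exact and no numerical tolerance is needed.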

As we have seen, the true degrees of freedom at $\mathbf{p}' = (0,0,1)$, i.e. $p'_\mu = (1,0,0,1)$, are the
transverse components $\hat{A}_1$ and $\hat{A}_2$. The action of $E(2) \subset L$ on $\hat{A}_\mu(p')$ given (in infinitesimal
form) by the above matrices then descends to an action on $\mathbb{C}^2$ as realized by $(\hat{A}_1,\hat{A}_2)$, seen as
the quotient of $\mathbb{C}^3$, consisting of the 4-vectors $(\hat{A}_0,\hat{A}_1,\hat{A}_2,\hat{A}_3)$ with $\hat{A}_0 = \hat{A}_3$, by the $\mathbb{C}$-action

$$(\hat{A}_3,\hat{A}_1,\hat{A}_2,\hat{A}_3) \mapsto (\hat{A}_3 + i\lambda,\hat{A}_1,\hat{A}_2,\hat{A}_3 + i\lambda). \tag{8.160}$$

From the above matrices, this gives

$$T_1\begin{pmatrix}\hat{A}_1\\ \hat{A}_2\end{pmatrix} = 0; \qquad
T_2\begin{pmatrix}\hat{A}_1\\ \hat{A}_2\end{pmatrix} = 0; \qquad
J_3\begin{pmatrix}\hat{A}_1\\ \hat{A}_2\end{pmatrix} = \begin{pmatrix}-\hat{A}_2\\ \hat{A}_1\end{pmatrix}. \tag{8.161}$$

This means that the $\mathbb{R}^2$ in $E(2)$ acts trivially whilst the $SO(2)$ acts in its defining representation
on $\mathbb{R}^2$, albeit complexified to $\mathbb{C}^2$. The hermitian matrix $iJ_3$ is diagonal in the basis

$$u_\pm = (e_1 \pm ie_2)/\sqrt{2}, \tag{8.162}$$

with eigenvalues $\pm1$. Thus the $E(2)$-action defined in (8.149) is a direct sum of the characters $u_\lambda$
with $\lambda = \pm1$. In sum, we have proved the following result (where the $\pm1$ are these helicities):


Proposition 8.7 The space obtained by solving the Maxwell equation (8.142) modulo the gauge
transformations (8.143) is isomorphic, as a (Hilbert) space on which the Poincaré group $P$ acts
via (8.141), to the direct sum of the unitary irreducible representations $U_{(0,+,1)}$ and $U_{(0,+,-1)}$.
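The key step behind Proposition 8.7, namely that $T_1$ and $T_2$ act trivially on the quotient whereas $iJ_3$ has eigenvalues $\pm1$ on the transverse components, can be checked on sample data. The sketch below assumes, as a simplification, that the matrices of (8.158) and $J_3$ act on the column $(\hat{A}_0,\hat{A}_1,\hat{A}_2,\hat{A}_3)$ exactly as written; the sample values and helper names are arbitrary.

```python
import math

T1 = [[0, 1, 0, 0], [1, 0, 0, -1], [0, 0, 0, 0], [0, 1, 0, 0]]
T2 = [[0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, -1], [0, 0, 1, 0]]
J3 = [[0, 0, 0, 0], [0, 0, -1, 0], [0, 1, 0, 0], [0, 0, 0, 0]]


def apply(m, a):
    """Matrix m applied to the column a = (A_0, A_1, A_2, A_3)."""
    return [sum(m[i][k] * a[k] for k in range(4)) for i in range(4)]


def is_pure_gauge(a):
    """True if a = (c, 0, 0, c), i.e. a lies along the gauge direction in (8.160)."""
    return a[0] == a[3] and a[1] == 0 and a[2] == 0


def helicity(u1, u2):
    """Eigenvalue of i * J3 on the transverse pair (u1, u2), assumed to be an eigenvector."""
    rotated = apply(J3, [0, u1, u2, 0])
    return 1j * rotated[1] / u1


a3, a1, a2 = 0.3 + 0.1j, 1.0 - 2.0j, 0.5j
A = [a3, a1, a2, a3]  # satisfies the constraint A_0 = A_3 of (8.152)

print(is_pure_gauge(apply(T1, A)), is_pure_gauge(apply(T2, A)))
print(apply(J3, A)[1:3])  # (-A_2, A_1), as in (8.161)

s = 1.0 / math.sqrt(2.0)
print(helicity(s, 1j * s), helicity(s, -1j * s))  # eigenvalues for u_+ and u_-
```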


We now repeat this analysis for linearized GR. Instead of the vector $A_\mu$, we have the
symmetric tensor $h_{\mu\nu}$, see §8.4, which under the Poincaré group transforms as

$$(\Lambda,a)\cdot h_{\mu\nu}(x) = \Lambda_\mu^{\ \rho}\Lambda_\nu^{\ \sigma}h_{\rho\sigma}(\Lambda^{-1}(x-a)). \tag{8.163}$$

Instead of the (free) Maxwell equation (8.142) we have

$$G^L_{\mu\nu} = 0, \tag{8.164}$$

i.e. the (free) linearized Einstein equations, see (8.116) - (8.117). Instead of the gauge
transformation (8.143) we have the (linearized = infinitesimal) coordinate transformations

$$h_{\mu\nu} \mapsto h_{\mu\nu} + \partial_\mu\xi_\nu + \partial_\nu\xi_\mu. \tag{8.165}$$

Finally, instead of the Lorenz gauge condition $\partial^\mu A_\mu = 0$ we have the analogous equation

$$\partial^\mu\bar{h}_{\mu\nu} = 0, \tag{8.166}$$

see (8.115), which is obtained by linearizing the wave gauge (7.108). This reduces (8.164) to

$$\Box\bar{h}_{\mu\nu} = 0. \tag{8.167}$$

Given (8.166) - (8.167), one may carry out residual gauge transformations (8.165), provided

$$\Box\xi_\mu = 0. \tag{8.168}$$


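Quite analogously to the electromagnetic case, we now show that for helicity $\pm2$ we have:


Proposition 8.8 The space obtained by solving the linearized Einstein equations (8.164) modulo
infinitesimal coordinate transformations (8.165), or, equivalently, the space of solutions of
the pair of equations (8.166) - (8.167) modulo the residual transformations (8.165) where $\xi_\mu$
satisfies (8.168), is isomorphic, as a (Hilbert) space on which the Poincaré group $P$ acts via
(8.163), to the direct sum of the unitary irreducible representations $U_{(0,+,2)}$ and $U_{(0,+,-2)}$.


Proof. Once again, we start by solving (8.167) analogously to (8.146), upon which a simple
computation based on (8.166) and (8.165) shows that (8.152) - (8.154) is replaced by

$$\begin{aligned}
&\hat{h}_{01} = \hat{h}_{13}; \quad \hat{h}_{02} = \hat{h}_{23}; \quad \hat{h}_{03} = \tfrac{1}{2}(\hat{h}_{00} + \hat{h}_{33}); \quad \hat{h}_{22} = -\hat{h}_{11};\\
&\hat{h}_{00} \mapsto \hat{h}_{00} + 2i\hat{\xi}_0; \quad \hat{h}_{01} \mapsto \hat{h}_{01} + i\hat{\xi}_1; \quad \hat{h}_{02} \mapsto \hat{h}_{02} + i\hat{\xi}_2; \quad \hat{h}_{03} \mapsto \hat{h}_{03} + i(\hat{\xi}_0 + \hat{\xi}_3);\\
&\hat{h}_{11} \mapsto \hat{h}_{11}; \quad \hat{h}_{12} \mapsto \hat{h}_{12}; \quad \hat{h}_{22} \mapsto \hat{h}_{22};\\
&\hat{h}_{13} \mapsto \hat{h}_{13} + i\hat{\xi}_1; \quad \hat{h}_{23} \mapsto \hat{h}_{23} + i\hat{\xi}_2; \quad \hat{h}_{33} \mapsto \hat{h}_{33} + 2i\hat{\xi}_3,
\end{aligned} \tag{8.169}$$

where for simplicity we omitted the argument $p'$ common to all $\hat{h}_{\mu\nu}$. We conclude that the
unconstrained and ungauged degrees of freedom are $(\hat{h}_{11},\hat{h}_{12})$. Similarly to (8.161), this gives

$$T_1\begin{pmatrix}\hat{h}_{11}\\ \hat{h}_{12}\end{pmatrix} = 0; \qquad
T_2\begin{pmatrix}\hat{h}_{11}\\ \hat{h}_{12}\end{pmatrix} = 0; \qquad
J_3\begin{pmatrix}\hat{h}_{11}\\ \hat{h}_{12}\end{pmatrix} = 2\begin{pmatrix}-\hat{h}_{12}\\ \hat{h}_{11}\end{pmatrix}, \tag{8.170}$$

where the factor 2 arises from the product $\Lambda$-action on the right-hand side of (8.163), which in turn
leads to a sum of $J_3$-actions. So also here, we introduce polarized states (8.162), where this time
$e_i$ is the unit vector for the $\hat{h}_{1i}$ component, where $i = 1,2$.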
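The factor 2 in (8.170) can be made explicit in a small computation. The sketch below takes the transverse traceless $2\times2$ block with entries $\hat{h}_{11}$, $\hat{h}_{12}$, $\hat{h}_{22} = -\hat{h}_{11}$, applies the infinitesimal rotation as $\delta h = Jh + hJ^T$ (the sum of $J_3$-actions on the two indices, with $J$ the spatial block of $J_3$), and reads off the eigenvalues of $iJ_3$ on the polarized states (8.162). The function names and sample values are hypothetical.

```python
import math

J = [[0.0, -1.0], [1.0, 0.0]]  # block of J_3 acting on the indices 1 and 2


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def transpose2(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def j3_on_tensor(h11, h12):
    """Components (11, 12) of J h + h J^T for h = [[h11, h12], [h12, -h11]]."""
    h = [[h11, h12], [h12, -h11]]
    jh = matmul2(J, h)
    hjt = matmul2(h, transpose2(J))
    return jh[0][0] + hjt[0][0], jh[0][1] + hjt[0][1]


def helicity(u1, u2):
    """Eigenvalue of i * J_3 on the pair (h11, h12) = (u1, u2), assumed to be an eigenvector."""
    d11, _ = j3_on_tensor(u1, u2)
    return 1j * d11 / u1


a, b = 1.0 + 2.0j, 0.5 - 1.0j
print(j3_on_tensor(a, b))  # equals 2 * (-b, a), as in (8.170)

s = 1.0 / math.sqrt(2.0)
print(helicity(s, 1j * s), helicity(s, -1j * s))  # eigenvalues for u_+ and u_-
```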


From the point of view of the representation theory of the Poincaré group, linearized gravity
describes massless particles (gravitons) with helicity $\pm2$. The full Einstein equations then add
self-interactions of this particle in a seemingly beautiful and consistent way. Unfortunately, no
one has so far been able to construct a renormalizable quantum field theory on this basis. But
this argument does show that the diffeomorphism invariance of GR, though here
just represented at some infinitesimal or linearized level, may have its origins in quantum theory,
in being a space-time covariant way to describe a certain massless representation of the Poincaré
group, which is related to orbits in momentum space and realized on Hilbert space.^{374}


374 One may criticize this approach for being a hybrid between classical and quantum reasoning, the former on the
side of the linearized Einstein field equations and the latter on the side of the unitary representation theory of the
Poincaré group on Hilbert space. But in fact a similar argument may be set up in a completely classical context,
where the role of unitary representations is replaced by that of coadjoint orbits. The argument even improves,
since implementing the gauge condition and quotienting by the action of the gauge group is unified into the single
procedure of Marsden-Weinstein reduction. See Landsman & Wiedemann (1995) and Landsman (1998).
8.6 Conformal analysis of the constraints


The initial value constraints (8.65) - (8.67) may be analyzed from a PDE point of view.^{375} In the
simplest case the metric is static, which means that $(M,g)$ has a timelike Killing vector field
$u^\mu$ and has a foliation $M = \sqcup_t\Sigma_t$ whose leaves $\Sigma_t$ are orthogonal to $u^\mu$ (equivalently, $\omega_{\mu\nu} = 0$).
See §8.4. In that case, in the “right” (i.e. adapted) coordinates the $g_{\mu\nu}$ are time-independent, as
for the Minkowski metric or the Schwarzschild solution. Hence $\tilde{k} = 0$, and if we also assume
vacuum for simplicity, then the only constraint on the ensuing initial data $(\Sigma,\tilde{g}_{ij})$ is

$$\tilde{R} = 0. \tag{8.171}$$

This is a vastly underdetermined system, since the six independent components of the metric $\tilde{g}_{ij}$
are subject to just one equation. But this doesn’t mean that the solution is trivial, and in particular
one should understand the degrees of freedom. This is a problem in pure Riemannian geometry,
whose solution as sketched below has a long and interesting history, which is worth recalling.
This history goes back to the uniformization theorem for Riemann surfaces:^{376}


Theorem 8.9 A simply connected Riemann surface is biholomorphically equivalent to one of:
• The Riemann sphere $S^2$;
• The complex plane $\mathbb{C}$;
• The upper half plane $\mathbb{H}$.


Consequently, any compact Riemann surface $\Sigma$ is (biholomorphically) isomorphic to $U/\Gamma$,
where $U$ is $S^2$, $\mathbb{C}$, or $\mathbb{H}$, and $\Gamma$ is a discrete subgroup of the group of biholomorphic bijections of
$U$ acting freely and discontinuously on $U$ (i.e., no $\Gamma$-orbit has an accumulation point).^{377}


This is equivalent to the following statement purely in the language of Riemannian geometry:


Theorem 8.10 A complete Riemannian metric on a simply connected 2d manifold (and hence
on a compact 2d Riemannian manifold) is conformally equivalent to a metric with constant
curvature, cf. Theorem 4.9 (from which compact spaces may be constructed as in Theorem 8.9).


Inspired by Theorem 8.10, the Yamabe problem asks if in arbitrary dimension any complete
Riemannian metric on some closed manifold is conformally equivalent to a metric with constant
Ricci scalar.^{378} This problem has been solved in the positive for compact manifolds (which are
automatically complete), using the following strategy.^{379} In $d = 3$, rescale the metric by

$$\tilde{g} = \Omega^4\gamma, \tag{8.172}$$


375 The approach in this section goes back to Racine (1934) and Lichnerowicz (1944, 1957). For further
developments see Choquet-Bruhat & York (1980), Bartnik & Isenberg (2004), Chruściel (2010), Chruściel, Galloway, &
Pollack (2010), Corvino & Pollack (2011), Isenberg (2014), Galloway, Miao, & Schoen (2015), and Rácz (2015).

376 For a historical survey of the uniformization theorem see de Saint-Gervais (2010). Jones & Singerman (1987) is
an accessible low-key treatment. A Riemann surface is defined through its complex structure, whereas a Riemannian
manifold is defined by its metric. In dimension 2, complex structures up to biholomorphic equivalence bijectively
correspond to Riemannian metrics up to the equivalence relation defined by isometry and conformal equivalence.
See also footnote 486. By (our) convention, a simply connected space is also connected.

377 Equivalently, each $x \in U$ has a neighbourhood $\mathcal{U}$ such that $\mathcal{U} \cap \gamma\cdot\mathcal{U} = \emptyset$ for all $\gamma \neq e$.

378 This is the only choice among the many equivalent notions of curvature (which all coincide in $d = 2$) for which
there is any hope for the problem to have a solution. See §4.5.

379 The solution is due to Schoen (1984). See Lee & Parker (1987) and Bär (2007/08) for complete treatments.
where the conformal factor $\Omega \in C^\infty(\Sigma)$ is strictly positive (so that $\tilde{g}$ is a Riemannian metric on
$\Sigma$), such that the Ricci scalar $\tilde{R} = R_{\tilde{g}}$ of $\tilde{g}$ is constant.^{380} Straightforward computations give

$$\tilde{R} = -8\,\Omega^{-5}L_\gamma\Omega, \tag{8.173}$$

where the linear differential operator $L_\gamma$, called the conformal Laplacian,^{381} is given by

$$L_\gamma := \Delta_\gamma - \tfrac{1}{8}R_\gamma, \tag{8.174}$$

in which $\Delta_\gamma := \gamma^{ij}\nabla_i\nabla_j$ is the Laplacian on $\Sigma$ defined by $\gamma$, and $R_\gamma$ is the Ricci scalar defined
by $\gamma$ (we omit tildes on geometric quantities defined by $\gamma$; those with a tilde are defined by $\tilde{g}$).

Given $\gamma$, the constraint (8.171) then becomes an equation for the scalar $\Omega$, namely

$$L_\gamma\Omega = 0. \tag{8.175}$$

This is a linear elliptic PDE, which can indeed be solved if $\Sigma$ is compact. In GR, this argument
applies more generally (e.g. assuming $\Omega \to 0$ at infinity in the non-compact case).
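A simple instance of (8.175) is obtained for flat $\gamma = \delta$ on $\mathbb{R}^3$ minus the origin, where $R_\gamma = 0$ and $L_\gamma$ reduces to the flat Laplacian. The sketch below checks by finite differences that $\Omega = 1 + m/2|x|$ is harmonic away from $x = 0$, so that $\Omega^4\delta$ has vanishing Ricci scalar by (8.173); this is the conformally flat metric already used in footnote 364. The sample points, the step size and the function names are arbitrary choices for this illustration.

```python
import math


def omega(x, m):
    """Candidate conformal factor Omega = 1 + m / (2|x|)."""
    r = math.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2)
    return 1.0 + m / (2.0 * r)


def flat_laplacian(f, x, h=1e-3):
    """Second-order central-difference approximation of the flat Laplacian of f at x."""
    total = 0.0
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        total += (f(xp) + f(xm) - 2.0 * f(x)) / h ** 2
    return total


m = 1.0
points = [(1.0, 2.0, 0.5), (-3.0, 0.7, 4.0), (10.0, -10.0, 10.0)]
residuals = [flat_laplacian(lambda x: omega(x, m), p) for p in points]
print(residuals)  # all close to zero, up to discretization and rounding error
```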

Ellipticity is here to stay, but linearity is typical of the assumption $\tilde{k} = 0$, and in general will
be replaced by gruesome nonlinearities. Indeed, already the next case, where

$$\tilde{k}_{ij} \neq 0; \qquad \mathrm{Tr}(\tilde{k}) := \tilde{g}^{ij}\tilde{k}_{ij} = 0, \tag{8.176}$$

is highly nonlinear.^{382} The constraints (8.65) - (8.67), again in the vacuum case, simplify to

$$\tilde{R} - \mathrm{Tr}(\tilde{k}^2) = 0; \tag{8.177}$$
$$\tilde{\nabla}^j\tilde{k}_{ij} = 0. \tag{8.178}$$

We now also choose some symmetric tensor $k_{ij}$ on $\Sigma$, such that

$$\nabla^jk_{ij} = 0, \tag{8.179}$$

but freely otherwise. It is easy to show that if we relate $\tilde{k}$ to $k$ via

$$\tilde{k}_{ij} = \Omega^{-2}k_{ij}, \tag{8.180}$$

then (8.179) implies (8.178) and hence only (8.177) remains, which is equivalent to

$$L_\gamma\Omega + \tfrac{1}{8}\mathrm{Tr}(k^2)\,\Omega^{-7} = 0. \tag{8.181}$$

This equation can be analyzed by traditional methods from nonlinear elliptic PDEs (notably by
constructing both sub- and super-solutions, i.e. replacing “$= 0$” by “$\leq 0$” and “$\geq 0$”).


380 In the context of GR, adding a cosmological constant $\Lambda$ modifies (8.171) to $\tilde{R} = 2\Lambda$. The possible signs of $\tilde{R}$,
i.e. $\tilde{R} = 0, \pm1$ up to rescaling, are restricted by the topology of $\Sigma$ and define the so-called Yamabe class of $\Sigma$.

381 Our formulae are for $d = 3$. In general dimension $d$, eq. (8.172) should be $\tilde{g} = \Omega^{4/(d-2)}\gamma$, whilst the conformal
Laplacian is $L_\gamma = \Delta_\gamma - c_dR_\gamma$, with $c_d := \tfrac{1}{4}(d-2)/(d-1)$. Then (8.173) reads $\tilde{R} = -(c_d\,\Omega^{(d+2)/(d-2)})^{-1}L_\gamma\Omega$.

382 Foliations with $\mathrm{Tr}(\tilde{k}) = 0$ are called maximal slicings. This is related to the Plateau Problem: if $\Sigma \subset M$ has
$\mathrm{Tr}(\tilde{k}) = 0$, and $Z \subset \Sigma$ is a two-dimensional surface, then the volume of any three-dimensional $S \subset \Sigma$ with $\partial S = Z$
is maximal compared to the volume of competing $S' \subset M$ subject to $\partial S' = Z \subset \Sigma$. In the purely Riemannian Plateau
Problem the volume (or, as in the original problem in one dimension lower, the surface area of the enclosed region)
would be minimal, but in the Lorentzian case it is maximal, for similar reasons why the length of timelike geodesics
is maximal rather than minimal (see §5.6): excursions of $S'$ outside $\Sigma$ are in the timelike direction and hence, through
lightlike approximations, reduce the volume (rather than increase it as in the Plateau Problem). See also §10.11.
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We move to the general case. Here is it customary and physically relevant to move to a 
transverse traceless version of k and k, where the traceless part is easy to define, namely 


6; = ki; — 1Tr (k) Bij; Oj; = kij — Tr (kK) Yj. (8.182) 

Adding energy-momentum and using the scaling (8.180), this reformulates the constraints as 
L/O.+1Tr(o7)Q~7 — LTr(k)?O° = -2rEO?; (8.183) 
V joi; — 2(ViTr (k))O® = 82P.0"". (8.184) 


The first of these (i.e. the Hamiltonian constraint) is called the Lichnerowicz equation. Defining
the transverse part of $\sigma$ and $\tilde\sigma$ is less straightforward: there exists a decomposition

$\sigma_{ij} = \sigma^{TT}_{ij} + (\mathbb{K}_\gamma X)_{ij},$ (8.185)
where $\sigma^{TT}$ is traceless and transverse in the sense that
$\mathrm{Tr}(\sigma^{TT}) = \gamma^{ij}\sigma^{TT}_{ij} = 0;$ (8.186)
$\nabla^j\sigma^{TT}_{ij} = 0,$ (8.187)
and X is some vector field, on which the conformal Killing operator $\mathbb{K}_\gamma$ acts by
$(\mathbb{K}_\gamma X)_{ij} = \nabla_i X_j + \nabla_j X_i - \tfrac23\gamma_{ij}\nabla_k X^k.$ (8.188)
This generalizes the usual Killing operator
$(K_\gamma X)_{ij} = \nabla_i X_j + \nabla_j X_i,$ (8.189)

whose solutions $K_\gamma X = 0$ are vector fields whose flow $\varphi_t$ consists of isometries, i.e., $\varphi_t^*\gamma = \gamma$;
vector fields solving $\mathbb{K}_\gamma X = 0$ are vector fields whose flow $\varphi_t$ consists of conformal isometries,
in that $\varphi_t^*\gamma = \Omega\gamma$ for some $\Omega > 0$, as above. The difficult part is the reconstruction of $\sigma_{ij}$ from
its transverse traceless part $\sigma^{TT}_{ij}$ and X. This may be done by solving a conformal version of the
Laplace equation, viz.

$\boldsymbol{\Delta}_\gamma X^i = \nabla_j(\mathbb{K}_\gamma X)^{ij} = \Delta_\gamma X^i + \tfrac13\nabla^i\nabla_j X^j + R^i{}_j X^j.$ (8.190)
Note that the kernel of $\boldsymbol{\Delta}_\gamma$ consists of conformal Killing vectors. Likewise for $\tilde g$ and $\tilde\sigma_{ij}$. In terms
of the free data $\gamma_{ij}$, $\sigma^{TT}_{ij}$, and $\tau = \mathrm{Tr}(k)$, the determined data $\Omega$ and X are found by solving the
final (conformal) version of the constraints, namely


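$L_\gamma\Omega + \tfrac18\mathrm{Tr}(\sigma^2)\,\Omega^{-7} - \tfrac1{12}\tau^2\,\Omega^{5} = -2\pi E\,\Omega^{5};$ (8.191)
$\boldsymbol{\Delta}_\gamma X^i - \tfrac23(\nabla^i\tau)\,\Omega^{6} = 8\pi P^i\,\Omega^{10}.$ (8.192)

Once this has been done, $\tilde g_{ij}$ and $\tilde k_{ij}$ can be (re)constructed via (8.172) and
$\tilde k_{ij} = \big((\mathbb{K}_\gamma X)_{ij} + \sigma^{TT}_{ij}\big)\Omega^{-2} + \tfrac13\tau\,\Omega^{4}\gamma_{ij},$ (8.193)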


and these solve the original constraints (8.65) - (8.67) in terms of the above free data. Of course,
the solvability of (8.191) - (8.192) is a difficult matter, which so far is only under complete
control if Tr(k) = 0. In general the cases where $\Sigma$ is compact or asymptotically flat are very
different, as usual in the initial-value approach to GR, and the field is still in development.³⁸³


383 See e.g. Galloway, Miao, & Schoen (2015), Holst, Maxwell, & Mazzeo (2017), and Carlotto (2021). 




8.7 Hamiltonian formulation of general relativity 


The Einstein equations admit a (constrained) Hamiltonian formulation, which goes back to Dirac 
and (independently) Bergmann in the 1950s. Their work was streamlined by Arnowitt, Deser, and 
Misner in the early 1960s, and in the 1970s was brought into mathematically rigorous form by 
various teams.³⁸⁴ The Hamiltonian approach does not differ dramatically from the PDE approach
as presented in §7.6 and §8.3, where both the initial data $(\Sigma, \tilde g, \tilde k)$ and the equations of motion
(8.61) - (8.62) were already brought into an almost Hamiltonian form, except that the Hamiltonian
and the Poisson brackets were missing. The original motivation for the Hamiltonian formalism,
namely to provide a basis for (“canonical”) quantum gravity, remains to be fulfilled,³⁸⁵ but even
at the classical level it is very useful (though not indispensable) for treating boundary terms,
symmetries, and conserved quantities (see below in this section as well as §8.9).

As in the previous 3 + 1 first-order version of the initial value problem of GR, also in the 
Hamiltonian formalism the role of general covariance in the original equations (or in the Einstein–Hilbert
action) is replaced by the freedom of choosing a foliation of our space-time M, and once
again this freedom is parametrized by the freely specifiable lapse and shift functions. Since 
the Hamiltonian equations turn out to be (first-order) hyperbolic and hence deterministic, all 
arbitrariness in the solution lies in the choice of the lapse and the shift (and hence of the foliation). 

Thus we have a time function t and ensuing foliation (8.1), where $\Sigma_t \cong \Sigma$ for a single
3d-manifold $\Sigma$, and each hypersurface $\Sigma_t$ is assumed to be spacelike in M. The action S(g),
from which the Hamiltonian will be derived, is defined on $V \subset M$, for which we assume that

$V = \bigcup_{t\in[t_i,t_f]} \Sigma'_t; \qquad \Sigma'_t = \Sigma_t \cap V,$ (8.194)
so that, as a shadow of the factorization $M \cong \Sigma\times\mathbb{R}$, we have
$V \cong \Sigma' \times [t_i,t_f],$ (8.195)

where $\Sigma' \cong \Sigma'_t$. If $\Sigma$ and hence each $\Sigma_t$ is closed (= compact without boundary) we assume that
$\Sigma'_t = \Sigma_t$; otherwise (think of $\Sigma \cong \mathbb{R}^3$), $\Sigma'_t \subset \Sigma_t$ is a compact submanifold with boundary

$S_t := \partial\Sigma'_t$ (8.196)

in $\Sigma_t$ (think of $\Sigma'_t \cong B^3_r$, the closed 3-ball in $\mathbb{R}^3$ with radius r, so that $S_t \cong \partial B^3_r = S^2_r$, the 2-sphere
in $\mathbb{R}^3$ with radius r). This means that the boundary $\partial V$ of V decomposes as

$\partial V = \Sigma'_{t_f} \cup \Sigma'_{t_i} \cup B; \qquad B = \bigcup_{t\in[t_i,t_f]} S_t,$ (8.197)

which is a (hyper)cylinder bounded above and below by 3-manifolds $\Sigma'_{t_f}$ and $\Sigma'_{t_i}$, respectively,
and bounded on the side by a 3-manifold B that in turn is foliated by the 2-manifolds $S_t$. Using

$g = -L^2\tilde g \;\Rightarrow\; \sqrt{-g} = L\sqrt{\tilde g},$ (8.198)


384 Pioneering papers include Dirac (1950, 1958ab), Bergmann (1949), Bergmann & Brunings (1949), and 
Arnowitt, Deser, & Misner (1962), whose approach is reviewed in Misner, Thorne, & Wheeler (1973), §21.6. See 
Salisbury (2020) for the history of canonical GR. Of the reviews in the theoretical physics literature we mention 
Poisson (2004), §4.2, and Sundermeyer (2014), chapter 7 and §C.3. The mathematics was done, in different ways,
by e.g. Fischer & Marsden (1979), Kijowski & Tulczyjew (1979), and Isenberg & Nester (1980). Dirac’s approach, 
which is still used, involved a heavy “constraint algorithm”, which can be avoided as long as one realizes that the 
ultimate justification of any Hamiltonian formalism is that it simply reproduces the Einstein equations. 

385 See DeWitt (1967), Rovelli (2004), and Thiemann (2007). As for nuclear fusion, one begins to lose patience. 
where $g = \det g$ and $\tilde g = \det\tilde g$, which follows from (8.14), as well as (8.58), we may then rewrite
the Einstein–Hilbert action (7.2) and the boundary term (7.36). This gives


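$S_G(g) = \int_{t_i}^{t_f} dt \int_{\Sigma'_t} d^3y\,\sqrt{\tilde g(y)}\,\big[L(\tilde R + \mathrm{Tr}(\tilde k)^2 + \mathrm{Tr}(\tilde k^2)) - 2(\mathcal{L}_{e_0}\mathrm{Tr}(\tilde k) + \tilde\Delta L)\big];$ (8.199)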


_ 3 5) u 3 z 7 
(g aA yı/|det($)|Tr (k,) [4 y/|det(&)| Tr (k) 
tf A 
-2 f at | a diata == 8.200 
A | det(g)| Tr (k)) ( ) 


where g; and k, are the induced 3-metric and the extrinsic curvature on ©, with regard to its 
embedding &, > V C M, respectively, and likewise ki is the extrinsic curvature on B —> V. The 
dots in (8.200) mean that for the moment we omit the X? terms in (7.36), which will be reinstalled 
at the end of the calculation. The last term in (8.199) is also a boundary term, since 


| @y 4/8 (y)AL= [axa &6)ViL) = [ee vu. (8.201) 

X, S; 

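similarly to (7.17) - (7.18), but now in 3d. Using (8.10), the penultimate term in (8.199) equals
$\mathcal{L}_{e_0}\mathrm{Tr}(\tilde k) = e_0\mathrm{Tr}(\tilde k) = L N^\mu\partial_\mu\mathrm{Tr}(\tilde k) = L\big(\nabla_\mu(\mathrm{Tr}(\tilde k)N^\mu) - \mathrm{Tr}(\tilde k)\nabla_\mu N^\mu\big).$ (8.202)


The first term gives a boundary term that cancels the first two terms in (8.200). The second is
$\nabla_\mu N^\mu = -\mathrm{Tr}(\tilde k),$ (8.203)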


as follows from (8.40), in which $\tilde\nabla_\mu$ is spatial, so that $N^\mu\tilde\nabla_\mu = 0$. Finally, (8.201), which came
from the bulk action (8.199), and the last term in the boundary action (8.200) neatly combine to 


[ee der) GFL + Tr (È) = f dz ,/|det(3;) LTr (K), (8.204) 


where n is the outward normal vector into &, of the embedding S; > %,, g; is the induced metric 
on the 2-manifold S,, and k; is the extrinsic curvature on %,, with respect to the embedding 
Sto pe Combining all, and reinserting the constant k? terms in (7.36), gives the total action 


S(g) = Sc(g) + Sa(g =" dt (J. dy /det(&(y))L- (R+ Tr (K) — Tr (&)*) 
=) | d’zı /det($&(z)) L (Tr (ki) nan (8.205) 


where $k^0$ is the extrinsic curvature of the embedding of the surface $S_t$ in $\mathbb{R}^3$, seen as the $x^0 = t$
slice of Minkowski space-time. We repeat that this term is necessary for convergence if $\Sigma'_t$
approaches $\Sigma_t$ in case that $\Sigma$ is not compact (if $\Sigma$ is closed all boundary terms may be ignored).
In order to pass from a Lagrangian to a Hamiltonian description, we write, with x = (y,t),


is 
Sole) = f° dt Zola): (8.206) 


$\mathcal{L}_G(x) = \sqrt{\tilde g(x)}\,L(x)\big(\tilde R(x) + \mathrm{Tr}(\tilde k_t(y)^2) - \mathrm{Tr}(\tilde k_t(y))^2\big),$ (8.207)


386 See Poisson (2004), §4.2.5, for a calculation yielding (8.204); his book contains many computations omitted
here. This may also be inferred by choosing L = 1 for the moment and noting that (8.204) is a geometric expression. 
Note that Poisson (and many physicists) defines the extrinsic curvature as minus ours (the math convention). 
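in which $\tilde k$ (and hence the quantities $\mathrm{Tr}(\tilde k)$ and $\mathrm{Tr}(\tilde k^2)$ derived from it) is determined by (8.22).
If we use (8.64) in the latter expression, in terms of spatial indices, we obtain

$\tilde k_{ij} = \tfrac12 L^{-1}\big(\tilde g_{ik}\tilde\nabla_j S^k + \tilde g_{jk}\tilde\nabla_i S^k - \partial_t\tilde g_{ij}\big),$ (8.208)

which allows us to compute the canonical momenta for the 3-metric $\tilde g_{ij}$, as in $p_i = \partial L/\partial\dot q^i$, viz.

$\tilde\pi^{ij} = \frac{\partial\mathcal{L}_G}{\partial(\partial_t\tilde g_{ij})} = \sqrt{\tilde g}\,\big(\mathrm{Tr}(\tilde k)\,\tilde g^{ij} - \tilde k^{ij}\big).$ (8.209)

Thus the $\tilde k_{ij}$ will be seen as functions of $\tilde g_{ij}$ and $\tilde\pi^{ij}$ by inverting (8.209), which yields

$\sqrt{\tilde g}\,\tilde k_{ij} = \tfrac12\mathrm{Tr}(\tilde\pi)\,\tilde g_{ij} - \tilde\pi_{ij},$ (8.210)
where $\mathrm{Tr}(\tilde\pi) = \tilde g_{ij}\tilde\pi^{ij}$ and $\tilde\pi_{ij} = \tilde g_{ik}\tilde g_{jl}\tilde\pi^{kl}$. One also uses the ‘de-densitized’ momentum³⁸⁷
$\pi_{ij} := \tilde\pi_{ij}/\sqrt{\tilde g} = \mathrm{Tr}(\tilde k)\,\tilde g_{ij} - \tilde k_{ij},$ (8.211)
so that
$\tilde k_{ij} = \tfrac12\mathrm{Tr}(\pi)\,\tilde g_{ij} - \pi_{ij}.$ (8.212)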
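The inversion of the momentum in favour of the extrinsic curvature is pure linear algebra in d = 3 and can be checked mechanically. The following is a minimal sketch, not part of the original derivation: it takes an arbitrary (hypothetical) positive-definite 3-metric and symmetric tensor k at a single point, forms the de-densitized momentum Tr(k)g − k, and verifies with exact rational arithmetic that ½Tr(π)g − π returns k and that Tr(π²) − ½Tr(π)² equals Tr(k²) − Tr(k)². All function and variable names are illustrative.

```python
from fractions import Fraction as F

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactors)."""
    d = det3(m)
    cof = [[None] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [a for a in range(3) if a != i]
            c = [b for b in range(3) if b != j]
            minor = m[r[0]][c[0]] * m[r[1]][c[1]] - m[r[0]][c[1]] * m[r[1]][c[0]]
            cof[i][j] = (-1) ** (i + j) * minor
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]

def trace(g_inv, t):
    """Metric trace g^{ij} t_{ij} of a covariant 2-tensor t."""
    return sum(g_inv[i][j] * t[i][j] for i in range(3) for j in range(3))

def trace_sq(g_inv, t):
    """Tr(t^2) = g^{ia} g^{jb} t_{ij} t_{ab}."""
    return sum(g_inv[i][a] * g_inv[j][b] * t[i][j] * t[a][b]
               for i in range(3) for j in range(3)
               for a in range(3) for b in range(3))

# Hypothetical sample data at one point: a positive-definite metric and a symmetric k.
g = [[F(2), F(1, 3), F(1, 10)], [F(1, 3), F(3, 2), F(1, 5)], [F(1, 10), F(1, 5), F(1)]]
k = [[F(1), F(-2), F(1, 2)], [F(-2), F(0), F(3)], [F(1, 2), F(3), F(-1, 4)]]
g_inv = inv3(g)

tr_k = trace(g_inv, k)
# De-densitized momentum with lower indices: pi_ij = Tr(k) g_ij - k_ij.
pi = [[tr_k * g[i][j] - k[i][j] for j in range(3)] for i in range(3)]
tr_pi = trace(g_inv, pi)
# Inversion: k_ij = (1/2) Tr(pi) g_ij - pi_ij.
k_back = [[tr_pi / 2 * g[i][j] - pi[i][j] for j in range(3)] for i in range(3)]

# The two quadratic combinations that enter the Hamiltonian constraint.
quad_pi = trace_sq(g_inv, pi) - tr_pi ** 2 / 2
quad_k = trace_sq(g_inv, k) - tr_k ** 2
```

Because the arithmetic is exact, `k_back == k` and `quad_pi == quad_k` hold as identities rather than up to rounding; the factor ½ is specific to three spatial dimensions, where Tr(π) = 2 Tr(k).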


Note that the boundary action $S_B(g)$ does not contain $\partial_t\tilde g_{ij}$ and hence makes no contribution to $\tilde\pi^{ij}$.
Furthermore, since neither $S_G$ nor $S_B$ contains the time derivatives $\dot L$ and $\dot S^i$ of the lapse and the
shift, the corresponding momenta vanish and may be ignored. The canonical Hamiltonian


$H(p_i, q^i) = \sum_i p_i\dot q^i - L(q,\dot q),$ (8.213)


where the $\dot q^i$ are to be eliminated in favour of their conjugate momenta, may then be computed as
usual. For GR this gives two terms, coming from the bulk and the boundary Lagrangians. First, 


H! = | & uo in n= i fe HL(y); 8.214 
c= (4) cO) ns cl) et cO) ( ) 
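$\mathcal{H}'_G(\tilde\pi^{ij}, \tilde g_{ij}) = \tilde\pi^{ij}\,\partial_t\tilde g_{ij} - \mathcal{L}_G(\tilde g_{ij}, \partial_t\tilde g_{ij}),$ (8.215)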


where in the spirit of the Hamiltonian formalism (called “geometrodynamics” in this regard),
we have replaced $\Sigma_t$ by $\Sigma$ and hence regard $\tilde g_{ij}$ and $\tilde\pi^{ij}$ as (geometric) quantities defined on $\Sigma$.
However, although $S_B$ does not contribute to the definition of the momenta, it plays the role of a
potential energy, and as such should be (negatively) added to the total Hamiltonian, in that


Ane = AE) Wane (8.216) 
y= d*zy/det(;(z)) L (Tr (X) — Tr (k°)), (8.217) 


cf. (8.205). We wrote primes here, because, like the original bulk action $S_G$, the bulk Hamiltonian
$H'_G$ in fact contains divergences leading to boundary terms. Indeed, if we solve (8.208) for $\partial_t\tilde g_{ij}$,


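387 As before, define a volume $\omega \in \Omega^3(\Sigma)$ by $\omega = \sqrt{\tilde g(x)}\,dx^1\wedge dx^2\wedge dx^3$. Geometrically, the canonical momentum
$\tilde\pi$ conjugate to the spatial metric $\tilde g$ should be regarded as an element of $\mathfrak{X}^{(2,0)}(\Sigma)\otimes\Omega^3(\Sigma)$, on which
interpretation (8.209) should be written as $\tilde\pi = (\mathrm{Tr}(\tilde k)\tilde g - \tilde k)^\sharp\otimes\omega$, or $\tilde\pi^{ij} = \sqrt{\tilde g}\,(\mathrm{Tr}(\tilde k)\tilde g^{ij} - \tilde k^{ij})\,dx^1\wedge dx^2\wedge dx^3$.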
and substitute this in (8.215), through a partial integration and Stokes’s theorem (on $\Sigma$, i.e. in
3d) one may replace the terms involving $\tilde\nabla_j S^k$ and $\tilde\nabla_i S^k$ by terms linear in $S^i$. This yields


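$H(\Sigma) = H_G(\Sigma) + H_B(\Sigma) = \lim_{\Sigma'\to\Sigma} H_G(\Sigma') + H_B(\Sigma');$ (8.218)


$H_G(\Sigma') = \int_{\Sigma'} d^3y\,\sqrt{\det(\tilde g(y))}\,\mathcal{H}_G(y);$ (8.219)
$H_B(\Sigma') = \int_{\partial\Sigma'} d^2z\,\sqrt{\det(\hat g(z))}\,\mathcal{H}_B(z),$ (8.220)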


where the true bulk and boundary Hamiltonian densities are given by 


$\mathcal{H}_G = L\,C_0 + S^i C_i;$ (8.221)
$\mathcal{H}_B = 2\big(L(\mathrm{Tr}(\hat k) - \mathrm{Tr}(k^0)) + S^i n^j\pi_{ij}\big).$ (8.222)


These, in turn, are defined in terms of the familiar expressions, cf. (7.148) - (7.149), 


$C_0 = -\tilde R + \mathrm{Tr}(\tilde k^2) - \mathrm{Tr}(\tilde k)^2 = -\tilde R + \mathrm{Tr}(\pi^2) - \tfrac12\mathrm{Tr}(\pi)^2;$ (8.223)
Ci = —2(V jk! -V/Tr(k)) = —2V ži. (8.224) 


Here the lapse and the shift are (freely) given functions of space and time, whereas the canonical
quantities $(\tilde g_{ij}, \tilde\pi^{ij})$ or, equivalently, $(\tilde g_{ij}, \pi_{ij})$, which are initially defined on $\Sigma$, evolve according
to the Hamiltonian equations of motion, and as such become time-dependent (what this “time”
means becomes clear only when the total globally hyperbolic space-time (M, g) plus its foliation
dictated by the solution is reconstructed). If $\Sigma$ is compact one may put $\Sigma' = \Sigma$ and forget about
the boundary terms (i.e. $\partial\Sigma' = \partial\Sigma = \emptyset$). If $\Sigma$ is non-compact and each approximant $\Sigma' \subset \Sigma$ is
compact, the metric and extrinsic curvature $(\hat g, \hat k)$ of $\partial\Sigma'$ seen as embedded in $\Sigma'$ are determined
by $(\tilde g, \tilde\pi)$ and hence are dependent variables. Moreover, since $C_0$ and $C_i$ are the Hamiltonian and
momentum constraints (7.148) - (7.149), respectively, we need to put


$C_0 = 0; \qquad C_i = 0.$ (8.225)


We may simply do this “by hand”, since as mentioned before, the ultimate justification of any
Hamiltonian formulation should lie in its equivalence with the original Lagrangian formulation
of the problem.³⁸⁸ One could also treat $(L, S^i)$ as canonical variables, and notice that, because
the action (8.199) - (8.200) does not contain their time derivatives, the associated canonical
momenta vanish. Using the Hamiltonian (8.218), from the Hamiltonian equations of motion


$\dot q^i = \partial H/\partial p_i; \qquad \dot p_i = -\partial H/\partial q^i$ (8.226)


we therefore obtain something like $\partial H/\partial L = -\dot\pi_L = 0$. This gives $C_0 = 0$, noting that the
variation of L (like that of $\tilde g_{ij}$, but not that of $\tilde\pi^{ij}$) is supposed to vanish on the boundary, so that
$H_B(\Sigma')$ makes no contribution to the equations of motion (although it is a crucial part of the
Hamiltonian itself, as we shall see). Similarly, $\partial H/\partial S^i = 0$ gives the spatial constraint $C_i = 0$.
The real equations of motion come from (8.226) applied to $\tilde g_{ij}$ and $\tilde\pi^{ij}$, as follows.³⁸⁹


388 If $(\Sigma, \tilde g)$ is asymptotically flat, for L = 1 and $S^i = 0$ the boundary Hamiltonian $H_B(\Sigma)$, which on the solution
to the constraints (8.225) is the Hamiltonian, equals the total mass (8.126). See Poisson (2004), Problem 4.6.7.
389 All boundary terms cancel, as in the Lagrangian approach, so one may as well ignore them here.
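• Using (8.212), the equation $\partial_t\tilde g_{ij} = \partial H/\partial\tilde\pi^{ij}$ may be shown to coincide with (8.22), i.e.,

$\frac{\partial\tilde g_{ij}}{\partial t} = 2L\big(\pi_{ij} - \tfrac12\mathrm{Tr}(\pi)\,\tilde g_{ij}\big) + \mathcal{L}_S\tilde g_{ij}.$ (8.227)

• The equation $\partial_t\tilde\pi^{ij} = -\partial H/\partial\tilde g_{ij}$ takes slightly more effort to make explicit; the result is

$\frac{\partial\tilde\pi^{ij}}{\partial t} = L\big(\mathrm{Tr}(\pi)\pi^{ij} - 2\pi^i{}_k\pi^{kj} + \tfrac12(\mathrm{Tr}(\pi^2) - \tfrac12\mathrm{Tr}(\pi)^2)\,\tilde g^{ij} - \tilde G^{ij}\big)\sqrt{\tilde g} + \big(\tilde\nabla^i\tilde\nabla^j L - \tilde g^{ij}\tilde\nabla_k\tilde\nabla^k L\big)\sqrt{\tilde g} + \mathcal{L}_S\tilde\pi^{ij},$ (8.228)

where $\tilde G_{ij} = \tilde R_{ij} - \tfrac12\tilde g_{ij}\tilde R$ is the 3d Einstein tensor defined by $\tilde g$. Eq. (8.228) is equivalent
to (8.59), and so the pair (8.227) - (8.228) is equivalent to the pair (8.61) - (8.62).

• In particular, for lapse L = 1 and shift $S^i = 0$ one obtains the Einstein flow equations

$\frac{\partial\tilde g_{ij}}{\partial t} = 2\pi_{ij} - \mathrm{Tr}(\pi)\,\tilde g_{ij};$ (8.229)
$\frac{1}{\sqrt{\tilde g}}\frac{\partial\tilde\pi^{ij}}{\partial t} = \mathrm{Tr}(\pi)\pi^{ij} - 2\pi^i{}_k\pi^{kj} + \tfrac12\big(\mathrm{Tr}(\pi^2) - \tfrac12\mathrm{Tr}(\pi)^2\big)\tilde g^{ij} - \tilde G^{ij}.$ (8.230)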


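The structure of (8.221) may be further clarified by comparison with electromagnetism (cf.
§7.4). In electrodynamics (for simplicity in Minkowski space-time), the Lagrangian density is
$\mathcal{L} = -\tfrac14 F_{\mu\nu}F^{\mu\nu},$ (8.231)
cf. (7.83), seen as a functional of $A_\mu$. The canonical momentum
$\pi^0 = \partial\mathcal{L}/\partial\dot A_0$ (8.232)
conjugate to $A_0$ vanishes, since $\mathcal{L}$ does not contain $\dot A_0$. The one conjugate to $A_i$ equals
$\pi^i = \partial\mathcal{L}/\partial\dot A_i = F^{i0} = -E^i,$ (8.233)
in terms of which the Hamiltonian is

$H = \int d^3x\,\big(-\mathbf{E}\cdot\dot{\mathbf{A}} - \mathcal{L}(A_0, \mathbf{A}, \partial_\mu\mathbf{A})\big) = \int d^3x\,\big(\tfrac12\mathbf{E}\cdot\mathbf{E} + \tfrac12\mathbf{B}\cdot\mathbf{B} + A_0\nabla\cdot\mathbf{E}\big),$ (8.234)

where $\mathbf{B} = \nabla\times\mathbf{A}$, and the term relevant to us, $A_0\nabla\cdot\mathbf{E}$, comes from partially integrating $-\mathbf{E}\cdot\nabla A_0$.
Thus $A_0$ is like the lapse and its equation of motion gives the (Gauss law) constraint
$\nabla\cdot\mathbf{E} = 0.$ (8.235)
The equation of motion for $A_i$, i.e. $\dot A_i = \partial H/\partial\pi^i = -\partial H/\partial E^i = -E_i + \partial_i A_0$, gives
$\partial_t\mathbf{B} = -\nabla\times\mathbf{E},$ (8.236)
whereas the one for $\pi^i$, i.e., $\dot E^i = -\dot\pi^i = \partial H/\partial A_i$, yields
$\partial_t\mathbf{E} = \nabla\times\mathbf{B}.$ (8.237)
Maxwell’s equations (in vacuo) are completed by the constraint (8.235), and the identity
$\nabla\cdot\mathbf{B} = 0,$ (8.238)

which follows from the definition $\mathbf{B} = \nabla\times\mathbf{A}$.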
8.8 Constraints and deformation algebra


In the 1970s an interesting perspective on the constraints (8.65) - (8.67) arose,³⁹⁰ which was
believed to be relevant for (canonical) quantum gravity (a hope unfulfilled so far), but which is
also crying out for further mathematical and other conceptual understanding in classical GR.³⁹¹

We return to the setting of §8.1. Let $\mathrm{Fol}(\Sigma,M,g)$ be the “space” of all spacelike foliations
F of some globally hyperbolic space-time M, as defined in §8.1, and let $\mathrm{Emb}(\Sigma,M,g)$ be the
“space” of spacelike embeddings $\iota: \Sigma\to M$. Then each foliation F defines a curve

$c: \mathbb{R}\to\mathrm{Emb}(\Sigma,M,g); \qquad t\mapsto c_t$ (8.239)

in $\mathrm{Emb}(\Sigma,M,g)$ via $c_t(x) = F(t,x)$. Although curves in $\mathrm{Emb}(\Sigma,M,g)$ do not necessarily
describe foliations of M (just think of the constant curve), it is interesting to study tangent vectors
to curves that do, and regard these vectors as “infinitesimal” foliations. Take a tangent vector

$X' \in T_\iota\mathrm{Emb}(\Sigma,M,g); \qquad X' = \frac{dc_t}{dt}\Big|_{t=0},$ (8.240)

for some curve $t\mapsto c_t$ in $\mathrm{Emb}(\Sigma,M,g)$ with $c_0 = \iota$. Given such a vector $X'$, we define

$X: \iota(\Sigma)\to TM; \qquad X(\iota(x)) := \frac{dc_t(x)}{dt}\Big|_{t=0},$ (8.241)
so that $X(\iota(x)) \equiv X_{\iota(x)} \in T_{\iota(x)}M$. The bijective correspondence $X'\leftrightarrow X$ gives an isomorphism
$T_\iota\mathrm{Emb}(\Sigma,M,g) \cong \Gamma(\iota(\Sigma), T_{\iota(\Sigma)}M)$ (8.242)

of vector spaces, where $T_{\iota(\Sigma)}M$ is the restriction of TM to $\iota(\Sigma)\subset M$, seen as a vector bundle
over $\iota(\Sigma)$. We may even remove the dependence on $\iota$ by further identifying

$\Gamma(\iota(\Sigma), T_{\iota(\Sigma)}M) \cong C^\infty(\Sigma)\oplus\mathfrak{X}(\Sigma); \quad\Rightarrow\quad T_\iota(\mathrm{Emb}(\Sigma,M,g)) \cong C^\infty(\Sigma)\oplus\mathfrak{X}(\Sigma),$ (8.243)

where the right-hand side is seen as a vector space (on this understanding one may write $\times$ instead
of $\oplus$). Namely, one has a unique future-directed normal vector field N to $\iota(\Sigma)$, normalized to

$g(N,N) = -1,$ (8.244)

as usual. Then decompose X = LN + S, as in (8.5), with the difference that so far the lapse L
and the shift S are defined on $\iota(\Sigma)$ alone (see below for their extension to M).

Before proceeding, let us first sketch a simpler situation that is well understood and which
one, with limited success so far, would like to mimic in GR. If we replace $\mathrm{Emb}(\Sigma,M,g)$ by the
diffeomorphism group $\mathrm{Diff}(\Sigma)$ of $\Sigma$ (say by replacing $M \leadsto \Sigma$), the analogue of (8.243) is

$T_\psi\mathrm{Diff}(\Sigma) \cong \mathfrak{X}(\Sigma),$ (8.245)


390 Key papers include Teitelboim (1973), Hojman, Kuchar, & Teitelboim (1976), Kuchar (1976), and Isham 
& Kuchar (1985ab). See also Anderson (2007), Gomes & Shyam (2016), and Gomes & Butterfield (2020) and 
references therein to later literature. A mathematically rigorous version of the Poisson brackets involved in this 
analysis, and more generally of the entire Hamiltonian approach to GR, including the PDE side, was simultaneously 
and independently developed in a series of papers culminating in the review by Fischer & Marsden (1979). All of 
this has so far only been done for compact Cauchy surfaces $\Sigma$, so we assume this. See also Proposition 6.19.

391 See Blohmann, Barbosa Fernandes, & Weinstein (2013), Bojowald et al. (2016), Blohmann & Weinstein 
(2018), and Głowacki (2019), all based on Lie algebroids. We refrain from specifying topologies and smoothness;
the best setting seems to be diffeology (Iglesias-Zemmour, 2013; van der Schaaf, 2020), as in the papers just cited. 
since vector fields integrate to one-parameter groups of diffeomorphisms. In particular, if we
regard $\mathrm{Diff}(\Sigma)$ as an infinite-dimensional Lie group, and take $\psi = \mathrm{id}_\Sigma$ to be the identity, then
$\mathfrak{X}(\Sigma)$ is the Lie algebra of $\mathrm{Diff}(\Sigma)$, whose Lie bracket, however, is minus the commutator.

For another way to look at this, let G be a Lie group and N some manifold. Recall that
a G-action on N is a smooth map $\varphi: G\times N\to N$, written as $\varphi(\gamma,x) \equiv \varphi_\gamma(x) \equiv \gamma x$, such that
$ex = x$ and $g(hx) = (gh)x$ for all $x\in N$ and $g,h\in G$. A G-action on N gives rise to a map

$\varphi_*: \mathrm{Lie}(G)\to\mathfrak{X}(N); \qquad \varphi_*(A)f(x) = \frac{d}{dt}f(e^{-tA}x)\Big|_{t=0},$ (8.246)
where $A\in\mathrm{Lie}(G)$, i.e. the Lie algebra of G, and $f\in C^\infty(N)$. This map can be shown to be a Lie
algebra homomorphism. Taking $G = \mathrm{Diff}(\Sigma)$ and $N = \Sigma$, with the defining action, we find

$\varphi_*: \mathfrak{X}(\Sigma)\to\mathfrak{X}(\Sigma); \qquad X\mapsto -X,$ (8.247)

whose minus sign is correct: as has just been noted, the Lie bracket on $\mathfrak{X}(\Sigma) = \mathrm{Lie}(\mathrm{Diff}(\Sigma))$ is
minus the commutator. Towards (8.243), another relevant construction is to take N = V to be a
vector space on which G acts linearly, and form the semidirect product $V\rtimes G$, with Lie bracket

$[(v,X),(w,Y)] = (Xw - Yv, [X,Y]),$ (8.248)

defined on the vector space $\mathrm{Lie}(V\rtimes G) = V\oplus\mathrm{Lie}(G)$, where Xw is the same as $\varphi_*(X)w$ as just
defined, evaluated at $T_0V = \mathrm{Lie}(V)\cong V$. Take $G = \mathrm{Diff}(\Sigma)$ and $N = C^\infty(\Sigma)$, where $\mathrm{Diff}(\Sigma)$
acts on $C^\infty(\Sigma)$ by pullback of its defining action on $\Sigma$. The bracket (8.248) is then defined on
$C^\infty(\Sigma)\oplus\mathfrak{X}(\Sigma)$. Writing $L\in C^\infty(\Sigma)$ and $S\in\mathfrak{X}(\Sigma)$, eq. (8.248) becomes

$[(L_1,S_1),(L_2,S_2)] = (\mathcal{L}_{S_1}L_2 - \mathcal{L}_{S_2}L_1, [S_1,S_2]),$ (8.249)

where $\mathcal{L}_S L = SL$ is the defining action of the vector field S on the function L, and $[S_1,S_2] = \mathcal{L}_{S_1}S_2$
is the usual commutator of vector fields, all happening on $\Sigma$. However, returning to (8.243),
consider the following closely related bracket on $C^\infty(\Sigma)\oplus\mathfrak{X}(\Sigma)$, seen as $T_\iota(\mathrm{Emb}(\Sigma,M,g))$:

$[(L_1,S_1),(L_2,S_2)] = (\mathcal{L}_{S_1}L_2 - \mathcal{L}_{S_2}L_1, [S_1,S_2] + L_1\tilde\nabla L_2 - L_2\tilde\nabla L_1).$ (8.250)

Here $\Sigma$ has been endowed with a Riemannian metric $\tilde g = \iota^*g$, as in the initial value formulation,
and the bracket depends on this metric through its gradient operator $\tilde\nabla := \nabla_{\iota^*g} \equiv \nabla_{\tilde g}$ (which
sends functions to vector fields).³⁹² In physics this bracket is called the deformation algebra.
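To make the difference between the semidirect-product bracket (8.249) and the deformation bracket (8.250) concrete, here is a minimal sketch in a toy setting that is our own assumption and not taken from the text: Σ is the real line with its flat metric, lapses and shifts are polynomials in one variable stored as integer coefficient lists, a shift s stands for the vector field s(x) d/dx, and the gradient of a function is its ordinary derivative. All function names are illustrative.

```python
def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def neg(p):
    return [-c for c in p]

def sub(p, q):
    return add(p, neg(q))

def mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def der(p):
    """Derivative of a polynomial given by its coefficient list."""
    return trim([i * c for i, c in enumerate(p)][1:])

def semidirect(a, b):
    """Toy version of (8.249): (S1 L2 - S2 L1, [S1, S2]) on the flat real line."""
    (l1, s1), (l2, s2) = a, b
    lapse = sub(mul(s1, der(l2)), mul(s2, der(l1)))
    shift = sub(mul(s1, der(s2)), mul(s2, der(s1)))
    return (lapse, shift)

def deformation(a, b):
    """Toy version of (8.250): the shift gains the extra term L1 grad L2 - L2 grad L1."""
    lapse, shift = semidirect(a, b)
    l1, l2 = a[0], b[0]
    extra = sub(mul(l1, der(l2)), mul(l2, der(l1)))
    return (lapse, add(shift, extra))

def jacobiator(bracket, a, b, c):
    """Cyclic sum [a,[b,c]] + [b,[c,a]] + [c,[a,b]] for a bracket on pairs."""
    terms = [bracket(a, bracket(b, c)), bracket(b, bracket(c, a)), bracket(c, bracket(a, b))]
    lapse, shift = [], []
    for t in terms:
        lapse, shift = add(lapse, t[0]), add(shift, t[1])
    return (lapse, shift)

# Sample elements (lapse, shift): coefficient lists in increasing degree.
a = ([1, 2], [0, 1])      # L = 1 + 2x,  S = x d/dx
b = ([0, 0, 1], [1])      # L = x^2,     S = d/dx
c = ([3], [0, 0, 1])      # L = 3,       S = x^2 d/dx

ab = deformation(a, b)
two_lapses = deformation(([1], []), ([0, 1], []))   # pure lapses L1 = 1, L2 = x
semidirect_jacobi = jacobiator(semidirect, a, b, c)
```

In this toy model the bracket of two pure lapses is a pure shift, which is the characteristic feature of (8.250): two normal deformations fail to commute by a tangential one. The sketch only checks the Jacobi identity for the semidirect product bracket, which is a genuine Lie bracket; nothing is claimed here about the Jacobi identity for the deformation bracket.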

As we shall see, the Poisson bracket of the constraints in GR reproduces this algebra, and
hence it would be nice to understand it better, for example by seeing it as a commutator. To this
end, we note that Diff(M) acts on the space $\mathrm{Emb}(\Sigma,M)$ of all embeddings $\Sigma\to M$ via

$\psi(\iota) = \psi\circ\iota,$ (8.251)

but this does not restrict to an action on the space $\mathrm{Emb}(\Sigma,M,g)$ of all spacelike embeddings. On
the other hand, if $\iota(\Sigma)$ is spacelike with respect to g, then $\psi\circ\iota$ is spacelike with respect to $\psi_*g$,
so that if $\psi$ is an isometry and $\iota\in\mathrm{Emb}(\Sigma,M,g)$, then $\psi\circ\iota\in\mathrm{Emb}(\Sigma,M,g)$, and hence one does
have a well-defined action of the group Iso(M,g) of all isometries of (M,g) on $\mathrm{Emb}(\Sigma,M,g)$.


392 For a fixed 3d metric $\tilde g$ this is not a Lie bracket, as the Jacobi identity may fail (Blohmann et al., 2013). The
following construction may also be found in this paper, shadowed by an analogous discussion in coordinates in 
Bojowald et al. (2016). For these authors, this construction is just an introduction to the use of Lie algebroids. 
One can do more, however, at least locally, in the sense of validity within some open nbhd U of
$\iota(\Sigma)$ in M, for some fixed $\iota\in\mathrm{Emb}(\Sigma,M,g)$ at which we explore the tangent space (8.243). As
in making the identification (8.243), fix some spacelike embedding $\iota: \Sigma\to M$ with fd normal N,
so far just defined on $\iota(\Sigma)$. By (for example) the tubular neighbourhood theorem of differential
geometry and the local properties of the exponential map, there exists an open nbhd

$U \cong I\times\Sigma$ (8.252)

of $\iota(\Sigma)$ in M, where $0\in I\subset\mathbb{R}$ is an open interval, such that the timelike geodesics with tangent
N at $\iota(\Sigma)$ (and hence normal to $\iota(\Sigma)$) do not cross within U. This gives a foliation

$U = \bigcup_{s\in I}\Sigma_s,$ (8.253)

where $\Sigma_0 = \iota(\Sigma)$ and $\Sigma_s$ is the set of points $\gamma_x(s)$, where $x\in\Sigma_0$ and $\gamma_x$ is the geodesic with
$\gamma_x(0) = x$ and $\dot\gamma_x(0) = N_x$. We call this local foliation, which is entirely determined by $\iota$,
canonical.³⁹³ It has lapse L = 1 and shift S = 0. The normal N, so far defined on $\Sigma_0$, extends to
a vector field N on U through parallel transport along these geodesics, i.e. by solving

$\nabla_N N = 0;$ (8.254)

as we know, this preserves the normalization (8.244). The canonical foliation then arises by
simply transporting $\Sigma_0$ along the flow of N.


Definition 8.11 A vector field $X\in\mathfrak{X}(U)$ is Gaussian iff one and hence each of the following
equivalent conditions is satisfied:³⁹⁴

1. $(\mathcal{L}_X g)(N,Y) = 0$ for each $Y\in\mathfrak{X}(U)$, i.e. $i_N\mathcal{L}_X g = 0$. Equivalently, $N^\mu(\nabla_\mu X_\nu + \nabla_\nu X_\mu) = 0$.
2. The flow $\psi_t$ of X preserves the canonical foliation near $\Sigma$ just defined.³⁹⁵

By definition, the first condition is equivalent to the following property of the flow:

$g_{\psi_t(x)}(T_x\psi_t N_x, T_x\psi_t Y_x) = g_x(N_x, Y_x),$ (8.255)

for each $x\in\Sigma$. This implies that, as announced, the flow of a Gaussian vector field maps
spacelike surfaces to spacelike surfaces, at least within U, i.e. for small enough t.³⁹⁶ In that
sense, Gaussian vector fields reside somewhere between Killing vector fields and arbitrary ones.

Proposition 8.12 Each vector field $\check X\in\Gamma(\Sigma_0, T_{\Sigma_0}M)$, where $\Sigma_0 := \iota(\Sigma)$, has a unique Gaussian
extension X to U, which, if decomposed as X = LN + S, has lapse L and shift S satisfying

$\mathcal{L}_N L = 0; \qquad \mathcal{L}_N S = \tilde\nabla L,$ (8.256)

where $\tilde\nabla L = \nabla L + g(N,\nabla L)N$ is the spatial gradient of L, i.e. $\tilde\nabla^\mu L = (g^{\mu\nu} + N^\mu N^\nu)\partial_\nu L$.
This is tangent to the leaves of the canonical foliation, so that $g(\mathcal{L}_N S, N) = g(\tilde\nabla L, N) = 0$.


393 In general it cannot be extended to M since such geodesics may cross. We assume $\Sigma$ is compact.

394 The name Gaussian comes from the fact that the flow $\psi_t$ also preserves the “Gaussian normal form” of the
metric $g = -dt^2 + \tilde g_{ij}(t,x)\,dx^i dx^j$, which L = 1 and S = 0 imply, see eq. (8.14) and Proposition 8.1 in §8.1.

395 That is, the leaves of the canonical foliation around $\psi_t(\Sigma_0)$ are the images under $\psi_t$ of the $\Sigma_s$ in (8.253).

396 Although $T_x\psi_t N_x$ may not equal $N_{\psi_t(x)}$, close to $\Sigma$ it is still timelike. So if $Y_x\in T_x\Sigma_0$, so that $g_x(N_x,Y_x) = 0$,
the vector $T_x\psi_t Y_x$ is orthogonal to some timelike vector and hence is spacelike, cf. Lemma 5.26 in O’Neill (1983).
Proof. Using Cartan’s formula, the defining condition of a Gaussian vector field X becomes
$0 = i_N\mathcal{L}_X g = (\mathcal{L}_X i_N + i_{[N,X]})g = (d\,i_X + i_X d)\,i_N g + i_{[N,X]}g = d(i_X i_N g) + i_{[N,X]}g,$ (8.257)
where $i_N g$ is the 1-form $g_{\mu\nu}N^\mu dx^\nu$, and we used $d\,i_N g = -ddt = 0$. Eq. (8.257) is the same as
$[N,X] = -\nabla(g(N,X)),$ (8.258)

which in turn may be rearranged as $\nabla_N X = \nabla_S N + \nabla L$. Taking normal and orthogonal components
with respect to N and using (8.244), (8.254), as well as torsion-freeness, from which

$[N,X] = \nabla_N X - \nabla_X N = \nabla_N X - \nabla_S N,$ (8.259)

gives (8.256). These are first-order PDEs for L and S with given initial values $\check L$ and $\check S$ on $\Sigma_0$. In
coordinates (t,x) adapted to the canonical foliation they even simplify to

$\frac{\partial L}{\partial t} = 0; \qquad \frac{\partial S}{\partial t} = \tilde\nabla L.$ (8.260)

The uniqueness claim then follows from the (elementary) theory of first-order PDEs.


Corollary 8.13 Let

$X_1 = L_1 N + S_1; \qquad X_2 = L_2 N + S_2$ (8.261)

be the unique Gaussian extensions of vector fields $\check X_1 = \check L_1 N + \check S_1$ and $\check X_2 = \check L_2 N + \check S_2$ defined
on $\iota(\Sigma)$. Then the commutator $[X_1,X_2]$ at $\iota(\Sigma)$ is given by the bracket (8.250), i.e.,

$[X_1,X_2] = [(\check L_1,\check S_1),(\check L_2,\check S_2)].$ (8.262)

The proof is a simple computation, using (8.256). In (other) words, the curious bracket
(8.250) is just the commutator of the Gaussian vector fields obtained by extending the given vector
fields $\check L_1 N + \check S_1$ and $\check L_2 N + \check S_2$ on $\iota(\Sigma)$. Looking at a Gaussian vector field as an infinitesimal
diffeomorphism of the special kind that preserves spacelike embeddings, this to some extent
justifies calling (8.250) a “deformation algebra”, although the situation remains to be clarified.


As already noted, the reason for studying this algebra is that in the Hamiltonian approach to GR
the constraints reproduce it in the following sense: writing the total Hamiltonian $H = H(\Sigma)$ as

$H_{(L,S)}(\tilde g,\tilde\pi) := \int_\Sigma d^3y\,\sqrt{\det(\tilde g)}\,\big(L\,C_0(\tilde g,\tilde\pi) + S^i C_i(\tilde g,\tilde\pi)\big),$ (8.263)
see (8.218), (8.223) and (8.224), the canonical Poisson bracket will turn out to be
$\{H_{(L_1,S_1)}, H_{(L_2,S_2)}\} = -H_{[(L_1,S_1),(L_2,S_2)]} = -H_{(\mathcal{L}_{S_1}L_2 - \mathcal{L}_{S_2}L_1,\; [S_1,S_2] + L_1\tilde\nabla L_2 - L_2\tilde\nabla L_1)}.$ (8.264)

Writing $H_L := H_{(L,0)}$ and $H_S := H_{(0,S)}$, three interesting special cases of this bracket are

$\{H_{L_1}, H_{L_2}\} = -H_{L_1\tilde\nabla L_2 - L_2\tilde\nabla L_1};$ (8.265)
$\{H_{S_1}, H_{S_2}\} = -H_{[S_1,S_2]} = -H_{\mathcal{L}_{S_1}S_2};$ (8.266)
$\{H_S, H_L\} = -H_{\mathcal{L}_S L}.$ (8.267)

Of these, the first involves the metric and is generally seen as mysterious. In the next two
sections we define a Poisson bracket for GR and try to explain the special status of the “super”
Hamiltonian (8.263) in the light of the bracket (8.264) and the theory of the momentum map.³⁹⁷


397 For another perspective see Głowacki (2021), who derives Definition 8.11 as a consistency condition between
the 4d and 3 + 1 descriptions of the dynamics that, unlike Theorem 8.2, also holds off the constraint surface.
8.9 Poisson brackets, constraints, and momentum map


The momentum map was originally introduced in the 1960s by Kostant and Souriau in the setting
of symplectic geometry. It clarified especially the relationship between conserved quantities and
symmetries, culminating in a Hamiltonian version of Noether’s theorem.³⁹⁸ The simplest setting
for the momentum map, however, is in Poisson geometry, where the Poisson bracket is not seen
as a concept derived from the symplectic structure, but stands on its own.³⁹⁹


Definition 8.14 A Poisson bracket on a manifold P is a Lie bracket $\{-,-\}$ on the (real) vector
space $C^\infty(P)$, such that for each $h\in C^\infty(P)$ the map

$X_h: f\mapsto\{h,f\}$ (8.268)

defines a vector field on P, called the Hamiltonian vector field of h. A manifold P equipped
with a Poisson bracket is called a Poisson manifold, and $(C^\infty(P),\{\,,\})$ is a Poisson algebra.

Unfolding, we have a bilinear map $\{-,-\}: C^\infty(P)\times C^\infty(P)\to C^\infty(P)$ that satisfies

$\{g,f\} = -\{f,g\};$ (8.269)
$\{f,\{g,h\}\} + \{h,\{f,g\}\} + \{g,\{h,f\}\} = 0;$ (8.270)
$\{f,gh\} = \{f,g\}h + g\{f,h\},$ (8.271)

where (8.269) - (8.270) is the Lie bracket property and (8.271) is the Leibniz rule for derivations.
The flow $\psi_t$ of $X_h$ is the motion generated by h, seen as “the Hamiltonian”. Hence if $(x^i)$ are
coordinates on P, and we write x(t) for $\psi_t(x)$, then x(t) solves the coupled first-order ODEs

$\frac{dx^i(t)}{dt} = \{h,x^i\}(x(t)).$ (8.272)

The following result is crucial, although its proof is a straightforward exercise:
Proposition 8.15 A Poisson bracket on P defines a Lie algebra homomorphism
$C^\infty(P)\to\mathfrak{X}(P); \qquad h\mapsto X_h.$ (8.273)
In particular, for any $f,g\in C^\infty(P)$ we have
$[X_f,X_g] = X_{\{f,g\}}.$ (8.274)

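The oldest example of a Poisson manifold is $P = \mathbb{R}^{2n}$ (even n = 1 is interesting!), where

$\{f,g\} = \sum_{j=1}^{n}\left(\frac{\partial f}{\partial p_j}\frac{\partial g}{\partial q^j} - \frac{\partial f}{\partial q^j}\frac{\partial g}{\partial p_j}\right).$ (8.275)

In that case, the Hamiltonian vector field of h is obviously given by

$X_h = \sum_{j=1}^{n}\left(\frac{\partial h}{\partial p_j}\frac{\partial}{\partial q^j} - \frac{\partial h}{\partial q^j}\frac{\partial}{\partial p_j}\right).$ (8.276)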


398 See Souriau (1969), Kostant (1970), Kijowski & Tulczyjew (1979), Guillemin & Sternberg (1982), Abraham 
& Marsden (1985), Marsden & Ratiu (1999), and Ortega & Ratiu (2004). 

399 In this approach symplectic geometry is a special case of Poisson geometry, arising when the Poisson tensor $\Pi$
is invertible. The symplectic form $\omega$ is then the inverse of $\Pi$ and the Poisson bracket equals $\{f,g\} = \omega(X_f,X_g)$.
Writing $\psi_t(p,q) = (p(t),q(t))$, we see that this flow is given by Hamilton’s equations:

$\frac{dp_j(t)}{dt} = -\frac{\partial h(p(t),q(t))}{\partial q^j};$ (8.277)
$\frac{dq^j(t)}{dt} = \frac{\partial h(p(t),q(t))}{\partial p_j}.$ (8.278)
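As a quick sanity check on the sign conventions, the canonical bracket can be evaluated numerically. The sketch below is an illustration under our own assumptions, not part of the text: it takes n = 1, orders the coordinates as x = (p, q), approximates partial derivatives by central differences, and uses a hypothetical anharmonic Hamiltonian h = p²/2 + q⁴/4. It then evaluates {h, q} and {h, p}, which by Hamilton’s equations should equal ∂h/∂p and −∂h/∂q.

```python
def partial(f, x, i, eps=1e-5):
    """Central-difference approximation to the i-th partial derivative of f at x."""
    xp, xm = list(x), list(x)
    xp[i] += eps
    xm[i] -= eps
    return (f(xp) - f(xm)) / (2 * eps)

def poisson(f, g, n):
    """Canonical bracket {f, g} on R^{2n} with coordinates x = (p_1..p_n, q^1..q^n):
    sum over j of  df/dp_j * dg/dq^j  -  df/dq^j * dg/dp_j."""
    def bracket(x):
        return sum(partial(f, x, j) * partial(g, x, n + j)
                   - partial(f, x, n + j) * partial(g, x, j)
                   for j in range(n))
    return bracket

def h(x):
    """Hypothetical Hamiltonian (unit mass, quartic potential), chosen for illustration."""
    p, q = x
    return 0.5 * p * p + 0.25 * q ** 4

def p_coord(x):
    return x[0]

def q_coord(x):
    return x[1]

point = [0.7, -1.3]                       # (p, q)
dq_dt = poisson(h, q_coord, 1)(point)     # should approximate dh/dp = p
dp_dt = poisson(h, p_coord, 1)(point)     # should approximate -dh/dq = -q^3
dq_dt_swapped = poisson(q_coord, h, 1)(point)
```

With this ordering of the arguments, {h, ·} reproduces the right-hand sides of Hamilton’s equations directly, and swapping the arguments flips the sign, as antisymmetry requires.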
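A different kind of example is $P = \mathbb{R}^3$, which is odd-dimensional, where we define
$\{f,g\}(x,y,z) = x\left(\frac{\partial f}{\partial y}\frac{\partial g}{\partial z} - \frac{\partial f}{\partial z}\frac{\partial g}{\partial y}\right) + y\left(\frac{\partial f}{\partial z}\frac{\partial g}{\partial x} - \frac{\partial f}{\partial x}\frac{\partial g}{\partial z}\right) + z\left(\frac{\partial f}{\partial x}\frac{\partial g}{\partial y} - \frac{\partial f}{\partial y}\frac{\partial g}{\partial x}\right).$ (8.279)

This is a special case of a general construction. Let $\mathfrak{g}$ be a Lie algebra, with basis $(T_a)$, so that

$[T_a,T_b] = \sum_c C^c_{ab}T_c,$ (8.280)

for certain structure constants $C^c_{ab}$. We write $\theta$ in the dual vector space $\mathfrak{g}^*$ as $\theta = \sum_a\theta_a\omega^a$,
where $(\omega^a)$ is the dual basis to a chosen basis of $\mathfrak{g}$, i.e., $\omega^a(T_b) = \delta^a_b$. In terms of these
coordinates, the Lie–Poisson bracket on $C^\infty(\mathfrak{g}^*)$ is defined by the formula

$\{f,g\}(\theta) = \sum_{a,b,c} C^c_{ab}\,\theta_c\,\frac{\partial f}{\partial\theta_a}\frac{\partial g}{\partial\theta_b}.$ (8.281)

Without a basis of $\mathfrak{g}$, the Lie–Poisson bracket may also be defined by extending the formula
$\{\hat A,\hat B\} = \widehat{[A,B]},$ (8.282)

where $A,B\in\mathfrak{g}$ and $\hat A\in C^\infty(\mathfrak{g}^*)$ is the evaluation map $\hat A(\theta) = \theta(A)$.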
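The Lie–Poisson bracket for so(3) can be explored numerically in the same spirit. The following sketch rests on assumptions of ours: the basis of so(3) is chosen such that the structure constants are the Levi-Civita symbol, so that the general formula reduces to the bracket on R³ given above, and derivatives are again approximated by central differences. It checks that the coordinate functions reproduce the Lie algebra relations and that the function x² + y² + z² has vanishing bracket with a sample function, which illustrates that this Poisson structure is degenerate (as it must be in odd dimension). All names are illustrative.

```python
def eps3(a, b, c):
    """Levi-Civita symbol on indices 0, 1, 2 (assumed structure constants of so(3))."""
    return (a - b) * (b - c) * (c - a) // 2

def partial(f, theta, a, eps=1e-5):
    """Central-difference approximation to the a-th partial derivative of f at theta."""
    tp, tm = list(theta), list(theta)
    tp[a] += eps
    tm[a] -= eps
    return (f(tp) - f(tm)) / (2 * eps)

def lie_poisson(f, g):
    """{f, g}(theta) = sum_{a,b,c} C^c_ab theta_c (df/dtheta_a) (dg/dtheta_b)."""
    def bracket(theta):
        return sum(eps3(a, b, c) * theta[c] * partial(f, theta, a) * partial(g, theta, b)
                   for a in range(3) for b in range(3) for c in range(3))
    return bracket

def coord(a):
    """Coordinate function theta -> theta_a, i.e. the evaluation map of the basis vector T_a."""
    return lambda theta: theta[a]

def casimir(theta):
    return theta[0] ** 2 + theta[1] ** 2 + theta[2] ** 2

def sample(theta):
    """An arbitrary test function, chosen for illustration only."""
    return theta[0] * theta[1] + theta[2]

theta0 = [0.4, -1.1, 2.5]
xy = lie_poisson(coord(0), coord(1))(theta0)      # should approximate theta_z
yz = lie_poisson(coord(1), coord(2))(theta0)      # should approximate theta_x
casimir_bracket = lie_poisson(casimir, sample)(theta0)
```

The first two values illustrate the basis-free characterization: the bracket of two evaluation maps is the evaluation map of the Lie bracket.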

We now turn to the momentum map, which generalizes momentum, angular momentum, and
almost every other quantity related to symmetry and conservation laws, culminating in Noether’s
Theorem. First, independently of Lie groups, Lie algebras also “act” on manifolds:

Definition 8.16 Let $\mathfrak{g}$ be a Lie algebra and P a manifold. A $\mathfrak{g}$-action on P is a Lie algebra
homomorphism from $\mathfrak{g}$ to $\mathfrak{X}(P)$, written $A\mapsto\xi_A$, so that in particular,

$[\xi_A,\xi_B] = \xi_{[A,B]}.$ (8.283)
If $\mathfrak{g} = \mathrm{Lie}(G)$, then such actions usually arise from G-actions via (8.246), i.e. $\xi_A = \varphi_*(A)$.

Definition 8.17 A momentum map for a Lie algebra action on a Poisson manifold P is a map
$J: P\to\mathfrak{g}^*,$ (8.284)
such that, defining $J_A: P\to\mathbb{R}$ by $J_A(x) = \langle J(x),A\rangle \equiv J(x)(A)$, for each $A\in\mathfrak{g}$ we have
$X_{J_A} = \xi_A.$ (8.285)
A Lie algebra action with momentum map is called Hamiltonian.

In words, for any $A\in\mathfrak{g}$, taking the Poisson bracket with the function $J_A$ generates the flow in P
obtained by acting on P with the one-parameter subgroup $s\mapsto\exp(-sA)$ of G. Noether’s (first)
theorem then gives the familiar link between symmetries and conserved quantities:
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Theorem 8.18 Let P be a Poisson manifold with an action of a connected Lie group G, whose 
associated g-action (8.246) has a momentum map J : P— g*. If h € C”(P) is G-invariant, i.e. 


h(y-x) = h(x) (8.286) 


for each y € Gand x € X, then for each A € g, the function J, is constant along the flow w; of 
Xn. That is, for any x € P and any t € R for which y(x) is defined, we have 


Ja (yı (x)) = Ja (x). (8.287) 


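Proof. Using all assumptions, as well as the definition of a flow, we compute:


$$\begin{aligned}
\frac{d}{dt} J_A(\psi_t(x)) &= X_h J_A(\psi_t(x)) && (\psi_t \text{ is the flow of } X_h)\\
&= \{h, J_A\}(\psi_t(x)) && (\text{definition of } X_h)\\
&= -\{J_A, h\}(\psi_t(x)) && (\text{antisymmetry of the bracket})\\
&= -X_{J_A} h(\psi_t(x)) && (\text{definition of } X_{J_A})\\
&= -\xi_A h(\psi_t(x)) && (8.285)\\
&= \frac{d}{ds} h(e^{sA}\cdot\psi_t(x))\Big|_{s=0} && (8.246)\\
&= \frac{d}{ds} h(\psi_t(x))\Big|_{s=0} && (G\text{-invariance of } h)\\
&= 0.
\end{aligned}$$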
A simple example is
$$P = \mathbb{R}^6 = \mathbb{R}^3 \times \mathbb{R}^3, \qquad (8.288)$$


with coordinates $x = (\vec{p},\vec{q})$, where $\vec{p} = (p_1, p_2, p_3)$ and $\vec{q} = (q^1,q^2,q^3)$, equipped with the
“canonical” Poisson bracket (8.275).


• Let $G = \mathbb{R}^6$ (as an additive group) act on $P$ by
$$(\vec{a},\vec{b}) \cdot (\vec{p},\vec{q}) = (\vec{p}+\vec{a},\vec{q}+\vec{b}). \qquad (8.289)$$
Then the derived $\mathfrak{g}$-action has a momentum map: identifying $\mathfrak{g} = \mathfrak{g}^* = \mathbb{R}^6$, this is
$$J(\vec{p},\vec{q}) = (\vec{q}, -\vec{p}), \qquad (8.290)$$
and if the (sub)group $G = \mathbb{R}^3$ acts on $P$ by
$$\vec{b} : (\vec{p},\vec{q}) \mapsto (\vec{p}, \vec{q}+\vec{b}), \qquad (8.291)$$


we simply have
$$J(\vec{p},\vec{q}) = -\vec{p}. \qquad (8.292)$$


The minus sign is of course unfortunate, but repairing this gives other undesirable signs
elsewhere. By Noether’s Theorem, if the potential $V$ in a Hamiltonian


$$h(\vec{p},\vec{q}) = \vec{p}^{\,2}/2m + V(\vec{q}) \qquad (8.293)$$
is translation-invariant, then momentum $\vec{p}$ is conserved. Similarly, if $G = SO(3)$ acts on
the same phase space $\mathbb{R}^6$ by


$$R \cdot (\vec{p},\vec{q}) = (R\vec{p}, R\vec{q}), \qquad (8.294)$$
then the derived $\mathfrak{g}$-action has a momentum map, which, identifying $\mathfrak{so}(3)^* = \mathbb{R}^3$, equals
$$J(\vec{p},\vec{q}) = -\vec{q} \times \vec{p}, \qquad (8.295)$$


which is (minus) the angular momentum! This time, if in the above Hamiltonian the
potential $V$ is rotation-invariant, Noether makes angular momentum $\vec{q} \times \vec{p}$ conserved.


• Now we keep $G = SO(3)$ but change $P$ to $\mathbb{R}^3$ with the Poisson bracket (8.279) and take
the defining action of $G = SO(3)$. If again we identify $\mathfrak{so}(3)^* = \mathbb{R}^3$, this action has a
momentum map $J : \mathbb{R}^3 \to \mathbb{R}^3$, given by


$$J(\vec{x}) = \vec{x}. \qquad (8.296)$$
More generally, the momentum map for the coadjoint action of $G$ on $\mathfrak{g}^*$, with Poisson bracket
(8.282), is simply the identity map $\mathfrak{g}^* \to \mathfrak{g}^*$, i.e.,
$$J(\theta) = \theta; \quad\Rightarrow\quad J_A = \hat{A}. \qquad (8.297)$$
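
The examples on $\mathbb{R}^6$ can be checked numerically. The sketch below is an illustration only: it assumes the sign convention $\{f,g\} = \sum_i (\partial f/\partial p_i\,\partial g/\partial q^i - \partial f/\partial q^i\,\partial g/\partial p_i)$ for the canonical bracket, approximates derivatives by central differences, and uses the illustrative potentials $V(\vec q) = -1/|\vec q|$ (rotation-invariant) and $V(\vec q) = q^1$ (not rotation-invariant). All function names are hypothetical.

```python
import math

def grad(f, z, eps=1e-6):
    """Central-difference gradient of f at z = [p1, p2, p3, q1, q2, q3]."""
    out = []
    for i in range(len(z)):
        zp, zm = list(z), list(z)
        zp[i] += eps
        zm[i] -= eps
        out.append((f(zp) - f(zm)) / (2 * eps))
    return out

def poisson(f, g, z):
    """Assumed convention: {f,g} = sum_i (df/dp_i dg/dq^i - df/dq^i dg/dp_i)."""
    df, dg = grad(f, z), grad(g, z)
    return sum(df[i] * dg[i + 3] - df[i + 3] * dg[i] for i in range(3))

def J_rot(i):
    """i-th component (i = 0, 1, 2) of J(p, q) = -q x p, cf. (8.295)."""
    def J(z):
        p, q = z[:3], z[3:]
        j, k = (i + 1) % 3, (i + 2) % 3
        return -(q[j] * p[k] - q[k] * p[j])
    return J

def h_central(z, m=1.0):
    """h = p^2/2m + V(q) with the rotation-invariant choice V(q) = -1/|q|."""
    p, q = z[:3], z[3:]
    return sum(c * c for c in p) / (2 * m) - 1.0 / math.sqrt(sum(c * c for c in q))

def h_tilted(z, m=1.0):
    """Same kinetic term, but V(q) = q^1 is not rotation-invariant."""
    return sum(c * c for c in z[:3]) / (2 * m) + z[3]

z0 = [0.3, -0.7, 0.2, 1.1, 0.4, -0.9]  # arbitrary sample point (p, q)

# d/dt J_A = {h, J_A}: vanishes for the invariant h, not for the tilted one.
drift_central = [poisson(h_central, J_rot(i), z0) for i in range(3)]
drift_tilted = [poisson(h_tilted, J_rot(i), z0) for i in range(3)]

# Bracket relation {J_1, J_2} = J_3 for the rotation example, and the
# translation example {q^1, -p_1} = 1 (nonzero although the group is abelian).
rot_defect = poisson(J_rot(0), J_rot(1), z0) - J_rot(2)(z0)
heis = poisson(lambda z: z[3], lambda z: -z[0], z0)
```

Under the stated convention, `drift_central` vanishes up to discretization error, which is Noether's theorem for the rotation-invariant Hamiltonian, whereas `drift_tilted` does not; the value of `heis` illustrates the constant term that obstructs $\{J_A,J_B\} = J_{[A,B]}$ for translations.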


As a crucial point, it would be natural to expect that a momentum map $J$, if it exists, satisfies


$$\{J_A, J_B\} = J_{[A,B]} \qquad (8.298)$$


for all $A,B \in \mathfrak{g}$. This property holds in our examples so far except the first, even on $P = \mathbb{R}^2$:
since $G = \mathbb{R}^2$ is abelian we have $[A,B] = 0$ and hence $J_{[A,B]} = 0$, but in a suitable basis $(e_1, e_2)$
of $\mathfrak{g} = \mathbb{R}^2$ we have $J_{e_1} = q$ and $J_{e_2} = -p$, so that $\{J_{e_1},J_{e_2}\} = 1$, i.e. the unit function on $\mathbb{R}^2$.
However, one may always restore (8.298) by passing to a suitable central extension of $G$ (and of $\mathfrak{g}$); in
the case at hand this is the (3d) Heisenberg group (see the references in footnote 398).


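We now try to understand the Poisson bracket (8.264) of canonical GR in this light. We first
define the bracket (and the underlying phase space) in question. Fixing $\Sigma$, we write $\mathcal{R} = \mathcal{R}(\Sigma)$
for the space of smooth 3d Riemannian metrics on $\Sigma$. The associated tangent bundle is


$$T\mathcal{R} \cong \mathcal{R} \times \mathcal{S}_2 \subset \mathcal{S}_2 \times \mathcal{S}_2, \qquad (8.299)$$


where $\mathcal{S}_2 = T^{(2,0)}(\Sigma)$ is the vector space of all covariant 2-tensors $t_{ij}$ on $\Sigma$. Similarly, the
associated cotangent bundle may be written in (cartesian) product form as


$$T^*\mathcal{R} \cong \mathcal{R} \times \mathcal{S}^2_d \subset \mathcal{S}_2 \times \mathcal{S}^2_d, \qquad (8.300)$$


where $\mathcal{S}^2_d = T^{(0,2)}_d(\Sigma)$ is the vector space of all contravariant 2-densities $d^{ij}$ on $\Sigma$.⁴⁰¹ Then


$$\langle d, t\rangle = \int_\Sigma d^{ij}\, t_{ij}, \qquad (8.301)$$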


400 If $G$ is connected and the given $\mathfrak{g}$-action on $P$ comes from a $G$-action via (8.246), then (8.298) holds iff
$J$ is equivariant with respect to the $G$-action on $P$ and the coadjoint action on $\mathfrak{g}^*$ (i.e. the dual to the adjoint action of $G$ on $\mathfrak{g}$).

401 Assuming $\Sigma$ orientable, these are 3-form valued covariant 2-tensors, or equivalently tensor products of covariant
2-tensors and 3-forms, cf. §7.1, so that after contraction of the indices they can be integrated over $\Sigma$. In coordinates
one may assume $dx^1 \wedge dx^2 \wedge dx^3$ as a standard volume form and hence $\mathcal{S}^2_d \cong \mathcal{S}^2$, but not canonically so. Strictly
speaking, in formulae like (8.209) one should then write $\sqrt{\tilde g}\,d^3x$ instead of $\sqrt{\tilde g}$.
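where $t_{ij} \in T_{\tilde g}\mathcal{R}$ and $d^{ij} \in T^*_{\tilde g}\mathcal{R}$, defines a pairing between $\mathcal{S}_2$ and $\mathcal{S}^2_d$.⁴⁰² Writing elements of
the cotangent bundle $T^*\mathcal{R}$ as $(\tilde g, \tilde\pi)$, we also have


$$T_{(\tilde g,\tilde\pi)}T^*\mathcal{R} \cong \mathcal{S}_2 \times \mathcal{S}^2_d. \qquad (8.302)$$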


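We may turn $T^*\mathcal{R}$ into a Poisson manifold by generalizing the canonical bracket (8.275) to


$$\{f,g\} = \int_\Sigma \left(\frac{\delta f}{\delta\tilde\pi^{ij}}\frac{\delta g}{\delta\tilde g_{ij}} - \frac{\delta g}{\delta\tilde\pi^{ij}}\frac{\delta f}{\delta\tilde g_{ij}}\right), \qquad (8.303)$$


where the $\delta$ are functional derivatives. This informal expression should be made precise. We
only (need to) consider functions $f : T^*\mathcal{R} \to \mathbb{R}$ of the form $f = \int_\Sigma F$, in which


$$F : T^*\mathcal{R} \to C^\infty_d; \qquad C^\infty_d \equiv C^\infty_d(\Sigma) := C^\infty(\Sigma) \otimes \Lambda^3(\Sigma), \qquad (8.304)$$


i.e. the space of density-valued smooth functions on $\Sigma$, so that the integral $\int_\Sigma F$ is defined. Using
(8.300), and assuming (typically Sobolev norm) topologies on all spaces involved for taking
limits, functions $F$ defined as in (8.304) then have partial Fréchet derivatives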


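$$D_{\tilde g}F(\tilde g,\tilde\pi) : \mathcal{S}_2 \to C^\infty_d; \qquad D_{\tilde\pi}F(\tilde g,\tilde\pi) : \mathcal{S}^2_d \to C^\infty_d, \qquad (8.305)$$
defined by
$$D_{\tilde\pi}F(\tilde g,\tilde\pi)(p) := \lim_{t\to 0}\frac{F(\tilde g,\tilde\pi+tp)-F(\tilde g,\tilde\pi)}{t}, \qquad (8.306)$$
$$D_{\tilde g}F(\tilde g,\tilde\pi)(h) := \lim_{t\to 0}\frac{F(\tilde g+th,\tilde\pi)-F(\tilde g,\tilde\pi)}{t}, \qquad (8.307)$$


etc. Writing $C^\infty$ for $C^\infty(\Sigma)$ as is customary in this business, these maps have (smooth) duals
$$D_{\tilde g}F(\tilde g,\tilde\pi)^* : C^\infty \to \mathcal{S}^2_d; \qquad D_{\tilde\pi}F(\tilde g,\tilde\pi)^* : C^\infty \to \mathcal{S}_2, \qquad (8.308)$$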


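respectively, with respect to the natural $L^2$ pairing, cf. (8.301). Following Fischer and Marsden,
the Poisson bracket of $f = \int_\Sigma F$ and $g = \int_\Sigma G$ on $T^*\mathcal{R}$ is then rigorously defined by


$$\{f,g\}(\tilde g,\tilde\pi) = \int_\Sigma \Big(\big\langle D_{\tilde\pi}F(\tilde g,\tilde\pi)^*(1_\Sigma), D_{\tilde g}G(\tilde g,\tilde\pi)^*(1_\Sigma)\big\rangle - \big\langle D_{\tilde\pi}G(\tilde g,\tilde\pi)^*(1_\Sigma), D_{\tilde g}F(\tilde g,\tilde\pi)^*(1_\Sigma)\big\rangle\Big). \qquad (8.309)$$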


With respect to this Poisson bracket, lengthy computations recover (8.265) - (8.267), where
$(\tilde L, \tilde S)$ are the values of the lapse and shift $(L, S)$ at any fixed time and do not necessarily come
from a Gaussian vector field. Furthermore, analogous computations bring the Hamiltonian
equations of motion (8.227) - (8.228) in the Poisson bracket form given by (8.277) - (8.278), i.e.
$$\frac{d\tilde g_{ij}}{dt} = \{H_{(\tilde L,\tilde S)}, \tilde g_{ij}\}; \qquad \frac{d\tilde\pi^{ij}}{dt} = \{H_{(\tilde L,\tilde S)}, \tilde\pi^{ij}\}, \qquad (8.310)$$


where we write $H_{(\tilde L,\tilde S)}$ instead of $H_{(L,S)}$ in order to make clear that the lapse and shift $(\tilde L, \tilde S)$ are
arbitrary (as long as they come from a regular foliation, if only a local one).⁴⁰³ It is crucial that
the equivalence between (8.227) - (8.228) and (8.310) holds “off-shell”, i.e. whether or not the
constraints are valid and hence whether or not the ensuing space-time metric $g$ is Ricci-flat.


402 Since we are at $\tilde g \in \mathcal{R}$, from which the standard integration with respect to $\sqrt{\tilde g}\,d^3x$ is defined, one may similarly
pair $\mathcal{S}_2$ with the space $\mathcal{S}^2 = T^{(0,2)}(\Sigma)$ of “ordinary” contravariant 2-densities.

403 These computations are done (though never fully) in Arnowitt, Deser, & Misner (1962), DeWitt (1967), Misner, 
Thorne, & Wheeler (1973), §21.6, Fischer & Marsden (1979), Poisson (2004), §4.2, and Thiemann (2007), §1.5. 
8.10 A momentum map for canonical general relativity?


The combination of (8.225), (8.250), (8.298), and (8.310) makes it attractive to regard the
Hamiltonian (8.263) as a momentum map of some kind. The point is not just that the various
Lie and Poisson brackets match,⁴⁰⁴ but also that the role of the lapse and the shift $(\tilde L,\tilde S)$, which
appear as parameters in the Hamiltonian, is now clearly distinguished from the role of $(\tilde g, \tilde\pi)$:


• The point $(\tilde g,\tilde\pi) \in T^*\mathcal{R}$ is simply the argument $x$ of $H \equiv J_A(x)$;
• The lapse and shift $(\tilde L,\tilde S)$ play the role of the label $A \in \mathfrak{g}$, cf. Definition 8.17.


The original idea of Fischer and Marsden to do so was as follows.⁴⁰⁵ With Poisson manifold
$$P = T^*\mathcal{R}, \qquad (8.311)$$


take $(\tilde g_0, \tilde\pi_0) \in P$, and some MGHD $(M,g,\iota)$ of the associated initial data $(\Sigma, \tilde g_0, \tilde k_0)$. We relabel $\iota$
as $\iota_0$ since it will act as a “reference embedding”; by definition $(M,g,\iota_0)$ induces the initial data
$(\tilde g_0,\tilde k_0)$ on $\iota_0(\Sigma) \subset M$. Then $\iota \in \mathrm{Emb}(\Sigma,M,g)$ sends $(\tilde g_0, \tilde\pi_0)$ to the point $(\tilde g, \tilde\pi) \in P$ obtained
from the data induced by $(M,g,\iota)$ on $\iota(\Sigma)$. As we have seen, tangent vectors to curves in
$\mathrm{Emb}(\Sigma,M,g)$ may be identified with pairs $(\tilde L,\tilde S) \in C^\infty(\Sigma) \times \mathfrak{X}(\Sigma)$, and if we agree that these
pairs form something like a Lie algebra $\mathfrak{g}$ of $\mathrm{Emb}(\Sigma,M,g)$, then these pairs will be labels $A$ in
$J_A$, as advertised above. Because of (8.310), the Hamiltonian is a momentum map for this Lie
algebra, and because of (8.264) this momentum map even satisfies the pleasant relation (8.298).
Elegant as it is, this idea is questionable in (at least) two different ways:


1. The thing that acts, i.e. $\mathrm{Emb}(\Sigma,M,g)$, depends on the point $(\tilde g_0, \tilde\pi_0)$ of $P$ at which the
action is supposed to be defined.⁴⁰⁶ To repair this, we will have to use (Lie) groupoids.


2. The MGHD $(M,g,\iota)$ is only defined when $(\tilde g_0, \tilde\pi_0)$ satisfies the constraints, and even so
it is only defined up to isometry, see Theorem 7.10. Both problems can be addressed by
refraining from the use of the MGHD, and even from only working with solutions to the
(vacuum) Einstein equations. However, the constructions will then merely be local.⁴⁰⁷


We fix $\Sigma$ and only consider space-times $(M, g)$ that can be obtained from some $(\tilde g,\tilde\pi) \in T^*\mathcal{R}$
by solving the coupled evolution equations (8.227) - (8.228) with lapse $L = 1$ and shift $S = 0$;
this fixes some representative in the isometry class of $(M, g)$. This can, in general, only be done
locally in time, but since we will quickly pass to an infinitesimal level this is no problem; the
entire Lie groupoid construction may merely be seen as motivation for the ensuing Lie algebroid
construction. We may therefore assume that $M = I \times \Sigma$, where $0 \in I \subset \mathbb{R}$ is some open interval.


404 The minus sign in (8.264) and hence the corresponding minus signs in (8.265) - (8.267) are caused by the 
fact that, as mentioned after (8.245), the Lie bracket in X(M) seen as Lie(Diff(M)) is minus the commutator, and 
likewise for the Gaussian vector fields and for X(%), so that (8.298) is correctly reproduced if we regard H as a 
momentum map J. Many authors, including Fischer & Marsden (1979), have the opposite sign for (8.268) as well 
as for the canonical Poisson bracket (8.275), in which case (8.263) - (8.267) has no minus signs, but (8.298) does. 

405 See Fischer & Marsden (1979), §4.6. The follow-up papers they referred to for details never appeared.

406 This part of the construction might be justified in that the remaining steps, reviewed below, only depend on the 
orbits of the “action”, rather than on the specific mathematical object that “acts” and causes these orbits. 

407 The ideas below are preliminary. For a different approach, so far also “work in progress” (though more
advanced), see Blohmann, Barbosa Fernandes, & Weinstein (2013) and Blohmann & Weinstein (2018). For attacks 
on the problem based on (multi-)symplectic geometry rather than Lie groupoids see Kijowski & Tulczyjew (1979), 
Lee & Wald (1990), the legendary GIMMsy project (Gotay et al., 1998-2004), and Forger & Romero (2005). 
One obtains a solution to the vacuum Einstein equations in this way only if $(\tilde g,\tilde\pi)$ satisfies the
constraints, but this is not necessary at this stage (and would even jeopardize the construction).
Let $G_0 = \mathrm{Emb}(\Sigma)$ consist of all triples $(M,g,\iota)$, where $(M,g)$ is some space-time of the said
type and $\iota : \Sigma \hookrightarrow M$ is some spacelike embedding with respect to $g$. Let


$$G = \mathrm{Move}(\Sigma) := \mathrm{Emb}(\Sigma) \times \mathrm{Emb}(\Sigma) \qquad (8.312)$$
be the associated pair groupoid:⁴⁰⁸ elements $m \in G$ “move” some $\iota_1(\Sigma)$ to some $\iota_2(\Sigma)$. Let
$$\rho : T^*\mathcal{R} \to G_0 = \mathrm{Emb}(\Sigma); \qquad (\tilde g, \tilde\pi) \mapsto (M, g, \iota) \qquad (8.313)$$


be given by taking $(M,g)$ to be the space-time obtained by solving the evolution equations
(8.227) - (8.228) with lapse $L = 1$ and shift $S = 0$, and take $\iota(x) = (0,x)$. Then $G$ acts on the
map $\rho$ in the natural way described above, i.e., the action of $((M_1, g_1,\iota_1), (M_2,g_2,\iota_2)) \in G$ on
$(\tilde g_0,\tilde\pi_0) \in T^*\mathcal{R}$ is defined iff the triple $(M_2, g_2,\iota_2)$ equals $\rho(\tilde g_0, \tilde\pi_0) = (M,g,\iota)$ as just described,
in which case, as in Fischer-Marsden, the result is the pair $(\tilde g,\tilde\pi)$ induced by $g_1$ on $\iota_1(\Sigma) \subset M_1$.
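
The structure maps of a pair groupoid, and the way it acts on a map rather than on a space, can be illustrated with a finite toy model. The sketch below is only an illustration under strong simplifying assumptions: the base set is a three-element list of hypothetical labels standing in for $\mathrm{Emb}(\Sigma)$, and the "phase space" is a list of hypothetical points, each tagged with the base point it lies over. The conventions $s(a,b) = b$, $t(a,b) = a$, $(a,b)\cdot(b,c) = (a,c)$ are those of footnote 408.

```python
from itertools import product

# Pair groupoid on a finite base set G0 (a toy stand-in for Emb(Sigma)).
G0 = ["iota_a", "iota_b", "iota_c"]      # hypothetical labels for embeddings
G = list(product(G0, G0))                # arrows (a, b)

def source(x):
    return x[1]                          # s(a, b) = b

def target(x):
    return x[0]                          # t(a, b) = a

def unit(a):
    return (a, a)                        # i(a) = (a, a)

def inverse(x):
    return (x[1], x[0])                  # I(a, b) = (b, a)

def multiply(x, y):
    """(a, b)(b, c) = (a, c); defined only when s(x) = t(y)."""
    if source(x) != target(y):
        raise ValueError("arrows are not composable")
    return (x[0], y[1])

# Toy "phase space": each point p carries the base point rho(p) it sits over.
P = [("data_on_" + a, a) for a in G0]    # hypothetical points

def rho(p):
    return p[1]

def act(x, p):
    """Action of the arrow x on p, defined iff s(x) = rho(p); lands over t(x)."""
    if source(x) != rho(p):
        raise ValueError("arrow does not act on this point")
    return ("data_on_" + target(x), target(x))

def acting_arrows(p):
    """The part of G that acts on p; it depends on p through rho(p)."""
    return [x for x in G if source(x) == rho(p)]
```

In this toy model only the arrows with source `rho(p)` act on a given point `p`, which mirrors the observation that the thing that acts on $(\tilde g_0,\tilde\pi_0)$ depends on that point.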

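With an appropriate smooth or diffeological structure, the Lie algebroid $\pi : \mathfrak{g} \to G_0$ associated
to $G$ is the tangent bundle $T\mathrm{Emb}(\Sigma) \to \mathrm{Emb}(\Sigma)$, which, at fixed $(M,g,\iota)$, we have studied in
some detail above. Consequently, the given $G$-action on $\rho : T^*\mathcal{R} \to \mathrm{Emb}(\Sigma)$ induces a $\mathfrak{g}$-action


$$\xi : \mathfrak{X}(\mathrm{Emb}(\Sigma)) \to \mathfrak{X}(T^*\mathcal{R}); \qquad A \mapsto \xi_A, \qquad (8.314)$$


where, according to eq. (8.243) in §8.8, a vector field $A$ on $\mathrm{Emb}(\Sigma)$ associates a Gaussian vector
field $(L,S)$, or equivalently its initial data $(\tilde L,\tilde S)$ at $\iota(\Sigma)$, to a triple $(M,g,\iota)$, cf. Proposition 8.12.
Writing $X = LN + S$ as before, and letting tildes denote the restrictions of the given quantities to
$\iota(\Sigma)$, the map $\xi$ defining the $\mathfrak{g}$-action is then quite beautifully given by


$$\xi_A : (\tilde g,\tilde\pi) \mapsto (\widetilde{\mathcal{L}_X g}, \widetilde{\mathcal{L}_X \pi}); \qquad X = A(\rho(\tilde g,\tilde\pi)). \qquad (8.315)$$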


408 A groupoid is a small category with inverses, i.e. one has two sets $G$ and $G_0$ (which in our case are infinite-dimensional
manifolds whose smooth or diffeological structure remains to be developed, cf. footnote 391), with
maps $i : G_0 \to G$ (the unit), $s,t : G \to G_0$ (source and target), $\mu : G \times_{G_0} G \to G$ (multiplication), where $G \times_{G_0} G :=
\{(x,y) \in G \times G \mid s(x) = t(y)\}$, and $I : G \to G$ (inverse), such that, writing $xy := \mu(x,y)$ whenever defined, we
have $s(xy) = s(y)$, $t(xy) = t(x)$, $t \circ I = s$, $s \circ I = t$, $(xy)z = x(yz)$, $s \circ i = t \circ i = \mathrm{id}_{G_0}$, $x\,i(s(x)) = i(t(x))\,x = x$,
$I(x)x = i(s(x))$, and $xI(x) = i(t(x))$. Thus $G$ consists of arrows $x$ sending $s(x) \in G_0$ to $t(x) \in G_0$. For example,
each equivalence relation $\sim$ on some set $G_0$, i.e. each subset $R \subset G_0 \times G_0$, defines a groupoid $G = R$ with structure
borrowed from the simplest example $R = G_0 \times G_0$, called the pair groupoid on $G_0$, where $s(a,b) = b$, $t(a,b) = a$,
$i(a) = (a,a)$, $I(a,b) = (b,a)$, and $(a,b) \cdot (b,c) = (a,c)$. The “opposite” example is a group, where $G_0 = \{e\}$. A
groupoid $G$ on $G_0$ may act not so much on a space but on a map $\rho : P \to G_0$, via a map $\varphi : G \times_{G_0} P \to P$, where
$G \times_{G_0} P := \{(x,p) \in G \times P \mid s(x) = \rho(p)\}$, subject to $\rho(xp) = t(x)$, where we write $xp \equiv \varphi(x,p)$, $(xy)p = x(yp)$,
and $i(a)p = p$, whenever defined. This is in fact the key to the use of groupoids in our GR context, since we see
that, except when $G_0 = \{e\}$, only part of $G$ acts on a given point $p \in P$ and this part may very well depend on $p$.
In the presence of sufficient smoothness a groupoid (then called a Lie groupoid) has an associated Lie algebroid
$\pi : \mathfrak{g} \to G_0$, which is a vector bundle over $G_0$, equipped with an additional map $\alpha : \mathfrak{g} \to TG_0$ (called the anchor)
and a Lie bracket on $\Gamma(\mathfrak{g})$, the space of smooth sections of $\pi$, such that $[A, fB] = f[A,B] + (\alpha(A)f) \cdot B$ for each
$f \in C^\infty(G_0)$, and $\alpha([A,B]) = [\alpha(A),\alpha(B)]$. Here the simplest examples are $\pi : TG_0 \to G_0$ with trivial (i.e. identity) anchor,
which arises as the Lie algebroid of the pair groupoid on $G_0$, and the Lie algebra of a Lie group, seen here as a
vector “bundle” on a point (and hence as a vector space). As in the Lie group case, one has a notion of a $\mathfrak{g}$-action on
a map $\rho : P \to G_0$, which often comes from a $G$-action but is defined independently of such an origin. Thus we have
a Lie algebra homomorphism $\xi : \Gamma(\mathfrak{g}) \to \mathfrak{X}(P)$, $A \mapsto \xi_A$, such that $\xi_{fA} = (\rho^* f)\,\xi_A$ and $\xi_A(\rho^* f) = \rho^*(\alpha(A)f)$.

See Mackenzie (2005) for a comprehensive treatment of Lie groupoids, Lie algebroids, and their actions. Moerdijk
& Mrčun (2003) and perhaps Landsman (1998, §II.3) or (2017, §7.4, §C.16) provide concise introductions.
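The equivalence between (8.227) - (8.228) and (8.310) implies that this $\mathfrak{g}$-action has our Hamiltonian
$H$ as a momentum map.⁴⁰⁹ For $X = S$ we have $\widetilde{\mathcal{L}_X g} = \mathcal{L}_{\tilde S}\tilde g$ and $\widetilde{\mathcal{L}_X \pi} = \mathcal{L}_{\tilde S}\tilde\pi$, so that
$\xi$ degenerates to a map $\tilde\xi : \mathfrak{X}(\Sigma) \to \mathfrak{X}(T^*\mathcal{R})$, $\tilde\xi_{\tilde S} : (\tilde g,\tilde\pi) \mapsto (\mathcal{L}_{\tilde S}\tilde g, \mathcal{L}_{\tilde S}\tilde\pi)$, which is the map
obtained from (8.246) by taking $G = \mathrm{Diff}(\Sigma)$, acting on $N = T^*\mathcal{R}$ by pullback of its action
$\psi(\tilde g) = (\psi^{-1})^*\tilde g$ on $\mathcal{R}$. With $\mathfrak{g} = \mathfrak{X}(\Sigma)$, it follows that $\tilde S \mapsto H_{(0,\tilde S)}$ is a momentum map for $\tilde\xi$.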

In the general case we are back to Fischer and Marsden, who (and this may even have been
their main goal) now invoke a powerful construction due to Marsden and Weinstein, namely
symplectic reduction.⁴¹⁰ In its simplest version, a Lie group $G$ acts on a symplectic manifold
$P$ (i.e. a Poisson manifold whose Poisson tensor is invertible) with momentum map $J : P \to \mathfrak{g}^*$
satisfying (8.298). On suitable regularity assumptions,⁴¹¹ the symplectic quotient


$$P/\!/G := J^{-1}(0)/G \qquad (8.316)$$


has a unique invertible Poisson tensor $\hat\Pi$ whose associated symplectic form $\hat\omega = \hat\Pi^{-1}$ satisfies


$$\pi^*_{J^{-1}(0)\to P/\!/G}\,\hat\omega = \iota^*_{J^{-1}(0)\hookrightarrow P}\,\omega, \qquad (8.317)$$


where $\Pi$ is the given invertible Poisson tensor on $P$ with inverse $\omega = \Pi^{-1}$, and the notation
is hopefully self-evident.⁴¹² Under the stated regularity assumptions, $P/\!/G$ is a (symplectic)
manifold, which in case $G$ defines gauge symmetries is identified with the space of physical
degrees of freedom.⁴¹³ Furthermore, at each $x \in J^{-1}(0)$ the tangent space $T_xP$ decomposes as


$$T_xP = T_x(J^{-1}(0)) \oplus T_xR \cong T_{[x]}(P/\!/G) \oplus T_x(Gx) \oplus T_xR, \qquad (8.318)$$


where $T_xR$ is any (linear) complement of $T_x(J^{-1}(0))$ within $T_xP$ (if one has a positive definite
metric on $T_xP$, one may define such complements as orthogonal complements). Moreover,
since $T_x(Gx)$, i.e. the tangent space to the orbit through $x$, is a subspace of $T_x(J^{-1}(0))$, the latter
splits into $T_x(Gx)$ and a complement thereof, which we may identify with $T_{[x]}(P/\!/G)$.
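
A minimal finite-dimensional example may help fix ideas. It is a sketch that reuses the translation example (8.291) - (8.292) restricted to a single direction; the choice of $P = T^*\mathbb{R}^n$ and of $G = \mathbb{R}$ acting on $q^1$ only is made here purely for illustration.

```latex
\begin{align*}
P &= T^*\mathbb{R}^n \ni (p_1,\dots,p_n,q^1,\dots,q^n), \qquad
  G = \mathbb{R}:\; b\cdot(p,q) = (p,\, q^1+b,\, q^2,\dots,q^n), \qquad J(p,q) = -p_1;\\
J^{-1}(0) &= \{p_1 = 0\}, \qquad
  P/\!/G = J^{-1}(0)/G \cong T^*\mathbb{R}^{n-1} \ni (p_2,\dots,p_n,q^2,\dots,q^n);\\
\dim P/\!/G &= \dim P - 2\dim G = 2n - 2 .
\end{align*}
```

Here $0$ is a regular value of $J$ and the action is free and proper, the orbit direction $T_x(Gx)$ is spanned by $\partial/\partial q^1$, and a complement $T_xR$ of $T_x(J^{-1}(0))$ is spanned by $\partial/\partial p_1$, in line with the decomposition (8.318).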

Apart from problems arising from the non-validity of some of the technical assumptions that
underwrite it (including the lack of a sufficiently developed smooth or diffeological framework
so far), the above constructions, originally intended for finite-dimensional Lie groups $G$ acting on
finite-dimensional manifolds $P$, at least conceptually generalize to the infinite-dimensional phase
space $P = T^*\mathcal{R}$ and our infinite-dimensional (Lie) groupoid $G = \mathrm{Move}(\Sigma,M)$. For $x = (\tilde g, \tilde\pi)$
lying in the constraint surface $H = 0$ (in which case, we recall, the ensuing space-time $(M,g)$
solves the vacuum Einstein equations), the orbit $G \cdot (\tilde g,\tilde\pi)$ by construction consists of all initial
data obtained from all spacelike embeddings $\iota : \Sigma \hookrightarrow M$ for the given metric $g$ (i.e. on $M$).


409 This is meant in the simplest way here, as a map $J : T^*\mathcal{R} \to \mathfrak{g}^*$. In the context of Lie groupoid actions there are
various other, more refined notions of momentum maps, see e.g. Bos (2007) and Blohmann & Weinstein (2018).

410 The original sources are Meyer (1973) and Marsden & Weinstein (1974). For a historical survey of symplectic
reduction see Marsden & Weinstein (2001). See also the references in footnote 398, as well as Landsman (1998).

411 These are that $0 \in \mathfrak{g}^*$ is a regular value of $J$ and that $G$ acts freely and properly at least on $J^{-1}(0)$.

412 Poisson purists will find it preferable to write this construction in terms of the Poisson structure alone, but
this only seems possible by appealing to the symplectic stratification theorem for Poisson manifolds, which also
involves symplectic geometry. See e.g. Ortega & Ratiu (2004), §10.1. First, there is a unique Poisson bracket on
$P/G$ such that $\pi^*_{P\to P/G}\{f,g\}_{P/G} = \{\pi^*_{P\to P/G}f, \pi^*_{P\to P/G}g\}_P$. Second, $J^{-1}(0)/G$ is one of the symplectic leaves in
$P/G$, with its associated Poisson structure (which by construction is symplectic) inherited from the one on $P/G$.

413 This is more or less the definition of a gauge symmetry! Even if $G$ gives observable changes, the quotient $P/\!/G$
is useful for simplifying the equations of motion, provided these come from a $G$-invariant Hamiltonian $h$ on $P$ via
(8.272), since there exists a unique Hamiltonian $\hat h$ on $P/\!/G$ such that, just like (8.317), $\pi^*_{J^{-1}(0)\to P/\!/G}\,\hat h = \iota^*_{J^{-1}(0)\hookrightarrow P}\,h$,
from which the motion on $P$ may be (re)constructed (in case of gauge symmetry, $\hat h$ is the physical Hamiltonian).
Deformations of the initial data $(\tilde g, \tilde\pi)$ tangent to the $G$-orbit in (8.318), i.e. along the subspace


$$T_x(Gx) = T_{(\tilde g,\tilde\pi)}(G \cdot (\tilde g,\tilde\pi)), \qquad (8.319)$$


therefore give rise to the same space-time $(M,g)$, at least up to isometry.⁴¹⁴ Even though $(\tilde g, \tilde\pi)$
satisfies the constraints, as we assume, and hence lies in $H^{-1}(0)$, the term $T_xR$ in (8.318) consists
of deformations of $(\tilde g,\tilde\pi)$ off the constraint surface $H = 0$. These deformations lead to space-times
not satisfying the Einstein equations and can be ignored. Finally, the term $T_{[(\tilde g,\tilde\pi)]}(T^*\mathcal{R}/\!/G)$
gives the direction of deformations of $(\tilde g, \tilde\pi)$ that lie within the constraint surface. These give rise
to space-times that satisfy the vacuum Einstein equations but are non-isometric to the (isometric)
space-time(s) with initial data $(\Sigma, \tilde g, \tilde\pi)$. If all this can be made to work globally, which of course
is a big “if”, the “space of gravitational degrees of freedom” may then be identified with $\mathcal{C}/G$,
where $\mathcal{C} \subset T^*\mathcal{R}$ is the constraint set $H = 0$. Furthermore, if $\mathrm{Einstein}(M)$ is the space of all
metrics $g$ on $M$ that arise as the MGHD of initial data $(\Sigma, \tilde g, \tilde\pi)$ satisfying the constraints (and
hence the vacuum Einstein equations), at least for $M = \mathbb{R} \times \Sigma$ we should also have


$$\mathrm{Einstein}(M)/\mathrm{Diff}(M) \cong \mathcal{C}/G. \qquad (8.320)$$


In defense of this canonical dream,⁴¹⁵ let us note that at least “the count is right”: at each $x \in \Sigma$
we a priori have 12 degrees of freedom (d.o.f.), since both $\tilde g_{ij}$ and $\tilde\pi^{ij}$ (or, equivalently, $\tilde k_{ij}$) are
symmetric $3 \times 3$ matrices, having 6 independent components each. Four constraints reduce this
number to $12 - 4 = 8$, and four components of $X = (L,S)$ further reduce this to $8 - 4 = 4$, that is,
the gravitational field has 2 physical d.o.f. per point (plus 2 associated momenta). This result had
previously been derived in §8.5 on the basis of a linear approximation to the Einstein equations,
which leads to the identification of these d.o.f. with the two helicity states of a massless helicity-2
particle (i.e. the graviton). Instead, the approach here is geometric and non-perturbative.⁴¹⁶
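
The count just given can be written out as a small function. This is merely the arithmetic of the preceding paragraph; the generalization to an arbitrary spatial dimension `d` (with `d + 1` constraints and `d + 1` lapse and shift components) is an assumption added here for illustration, and the function name is hypothetical.

```python
def gravitational_dof(d=3):
    """Count canonical degrees of freedom per point for spatial dimension d."""
    sym = d * (d + 1) // 2        # independent components of a symmetric d x d matrix
    phase_space = 2 * sym         # metric components plus conjugate momenta
    constraints = d + 1           # one Hamiltonian and d momentum constraints
    gauge = d + 1                 # lapse and d shift components
    reduced = phase_space - constraints - gauge
    return {
        "a_priori": phase_space,
        "after_constraints": phase_space - constraints,
        "reduced": reduced,
        "physical": reduced // 2,  # configuration d.o.f., each with one momentum
    }

count = gravitational_dof(3)
```

For `d = 3` this reproduces the chain 12, 8, 4, 2 of the text; for `d = 2` the same bookkeeping gives no local degrees of freedom at all.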


Jerry Marsden (1942-2010) discussing the momentum map for GR with the author in 1999. 


414 It is in this sense that $H$ is said to generate gauge transformations in GR. But it does not follow that moving
initial data $(\Sigma, \tilde g, \tilde\pi)$ in $M$ is unobservable or otherwise unphysical! See §8.11 for further discussion.

415 See Fischer & Marsden (1979), p. 207. The left-hand side was taken up by Fischer & Moncrief (1996, 1997). 

416 The fact that the count of the d.o.f. of the gravitational field can be done in these two very different ways reflects
a deep schism in the world of quantum gravity (Armas, 2021). The majority goes for string theory (a perturbative 
particle-physics based ideology), whereas a sizable minority prefers a non-perturbative geometric approach. This 
schism was already implicit in the almost simultaneous publication of the three masterpieces on GR by Weinberg 
(1972) on the one hand, and Misner, Thorne, & Wheeler (1973) and Hawking & Ellis (1973) on the other. 
8.11 Epilogue: The problem of time


Although the previous analysis is preliminary and non-rigorous, we expect that its conclusion is 
independent of the details, in the sense that any satisfactory analysis of the (gauge and non-gauge) 
degrees of freedom of GR should lead to the same general picture (perhaps this is what we mean 
by ‘satisfactory’). This enables us to put in our tuppence worth on the scarlet problem of time. 
The philosophical analysis of time is as old as philosophy itself, traditionally starting around 
500 BC with the opposition between Heraclitus, who famously maintained that everything 
constantly changed, and Parmenides, who felt that if not change, then at least time was an
illusion.⁴¹⁷ Jumping to the twentieth century, in what follows we focus on the implications:⁴¹⁸


time => change => A-series => B-series, (*) 


where we use the standard terminology in the philosophy of time, introduced by McTaggart: 


• In the A-series, events are ordered in a time series that goes from the past to the present
and moves on towards the future. This ordering assumes the existence of a “moving now”
and as such describes what has been called manifest time, which is what we actually
experience. With respect to the “now”, any event lies either in the past, or in the present,
or in the future, and this status changes as time flows, that is, as the “now” moves on.


• The B-series, on the other hand, merely orders events according to their relative position,
which can be either that they are simultaneous, or that one is earlier or later than the other.


In particular, there is no “now” in the B-series. One version of the “problem of time”, then, is the 
claim that modern physics gives us a B-series, whereas everyday experience gives us an A-series. 
In other words, physics fails to incorporate the “now” that dominates our perception of time.⁴¹⁹
However, physics is still supposed to be compatible with an A-series, whose existence is merely
foreign to its language. This problem is soft compared to the radical claim we will now discuss:


general relativity does not even provide a B-series (!)


417 This Pre-Socratic opposition between “becoming” and “being”, or “change” and “existence”, continued with
Aristotle. This had disastrous consequences for mathematical physics. In his Metaphysics, Aristotle organized 
knowledge into something like a 2 x 2 matrix, where the axes are “changing/permanent” and “dependent/independent” 
(that is, of man). He put physics in the change & independent entry, whereas mathematics was supposed to be 
permanent & independent (the latter against Plato). See e.g. Gaukroger (2020). This classification held back the 
interaction between physics and mathematics for 2000 years, until initially Kepler and Galilei and subsequently 
Huygens and especially Newton recombined them and thus provided the basis for modern science. 

418 These implications were all proposed by McTaggart (1908, 1927). See also Dainton (2010). The only implication
that really counts for our technical discussion is “time ⇒ B-series”, or rather its contrapositive “no B-series ⇒ no
time”, but the chain in (*) is convenient in order to frame the overall problem of time. The first implication goes
back at least to Aristotle (Physics, Book IV, chapter 11), see Shoemaker (1969) for a nice philosophical analysis. It
would be denied by Newton (Rynasiewicz, 2014), but GR can deny it, too, as it admits static solutions (see §8.4). 
The point, however, is that according to the arguments reviewed and critiqued below GR admits no flow of time 
whether or not time requires change. Similarly, the second implication needs to be argued for, as McTaggart does at 
some length, but his target is the A-series, whose alleged incoherence allows him to disprove the existence of time. 
Instead, the argument in our main text concerns the B-series. It is remarkable that of the two great twentieth-century 
philosophical treatises about existence and time, both of which are hard-core specimens of “armchair” philosophy 
based on pure speculation, McTaggart (1921, 1927) has been very influential on discussions that are informed by 
modern science, whereas Heidegger (1927) has, rightly, been completely sidelined in the philosophy of science. 

419 This is the version of the problem addressed in Callender (2017), whose opening sentences deserve to be quoted:
‘Time is a big invisible thing that will kill you. For that reason alone, one might be curious about what it is.’ 
In particular, the causal picture of space-time, based on the relations $I$ or $J$, i.e. on the partial
orderings $x \ll y$ or $x \leq y$ (cf. §5.3), is a hallucination (or, in more diplomatic parlance, it is part
of the “manifest” image of GR, as opposed to its “scientific” image). In reality, or so it is claimed,
there is just a “frozen” initial data set $(\Sigma,\tilde g,\tilde k)$ whose development into a space-time $(M,g)$ is
unphysical. In other words, despite the fact that relative to some foliation $M = \sqcup_t \Sigma_t$ (cf. chapter
8) the initial data $(\Sigma, \tilde g,\tilde k)$ appear to move into new and usually different data $(\Sigma_t, \tilde g_t,\tilde k_t)$, these
data are merely different descriptions of the same physical situation. If true, there is no change,
and hence, taking the contrapositive of (*), no time either. Was Parmenides right, after all?
In the literature one finds the following two arguments for the timelessness of GR:⁴²⁰


• Diffeomorphisms: Since the Einstein equations of GR are invariant under diffeomorphisms,
even for given initial data its solutions are unique only up to diffeomorphisms. Therefore,
in order to save determinism, the “observables” of the theory must be diffeomorphism-invariant,
too. This excludes any explicit time-dependence of physical quantities.


• Constrained Hamiltonian dynamics: In the Dirac-Bergmann approach to GR as a constrained
Hamiltonian system, time evolution is generated by the Hamiltonian constraint,
which according to this formalism generates gauge transformations. Once again, in the
interest of saving determinism the effect of such transformations is deemed unphysical.


The second argument is a “Hamiltonian shadow” of the first.⁴²¹ Both arguments are based on an
interpretational move within a certain formalism that is not in fact implied by that formalism.
In the Hamiltonian approach this move, which we question, interprets different canonical
data on the same gauge orbit as physically indistinguishable. Although this is indeed the case
in electrodynamics, in GR the situation is quite different. The real sense in which moving from
the canonical data $(\tilde g_0, \tilde\pi_0)$ on $\Sigma$ to time-evolved data $(\tilde g(t), \tilde\pi(t))$ on the same $\Sigma$ is a gauge
transformation, is that both data sets give rise to (i.e. are initial data for) the same space-time
$(M,g)$. To clarify this matter, for the convenience of the reader we now rephrase Theorem 8.2,
which may be seen as a corollary and reinterpretation of Theorem 7.10, in Hamiltonian form.


420 In connection with relativity, the philosophical analysis of time goes back to the special theory, see e.g. Cassirer
(1921), Bergson (1922), Schlick (1922), Reichenbach (1928), of whom the latter two also involve the general theory. 
See e.g. Ryckman (2018) and Stuur (2019) for recent historical and philosophical analysis. The problem of time in 
GR as discussed here has its historical roots in Einstein’s hole argument (see §1.5) and the ensuing issue of general 
covariance (§1.10), but may more specifically be traced back to Bergmann (1958, 1961). From theoretical physics 
we cite the reviews by Isham (1992) and Kuchař (1992) as well as the monograph by Anderson (2017); see also
Thiemann (2007). Defenders of claim (!) include Barbour (1999), Earman (2002), and Rovelli (2004). From
the philosophical literature our views are closest to Butterfield (1984) and Healey (2002, 2004). See also Norton 
(2010), Pitts (2014), Gryb & Thébault (2016), Rovelli (2019), and Thébault (2021). Maudlin (2002) dismisses 
the Hamiltonian version of the problem of time in GR as ‘completely phony’, but this verdict is predicated on his 
erroneous claim, made even twice, that the initial value problem of GR ‘admits of a unique maximal solution’. See 
Theorem 7.10, whose lack of absolute uniqueness (replaced by uniqueness up to isometry) is nothing but Hilbert’s 
Cauchy-problem version of the hole argument and hence may be seen as the root of the problem of time in the PDE 
approach to GR. Indeed, what makes the problem of time look genuine (though solvable) is that it pops up in almost 
any formulation of GR. However, Maudlin’s discussion of the “diffeomorphism” version is actually quite good. 

421 One also sometimes finds a mixture of these arguments to the effect that the (formal) Hamiltonian $H$ in GR
generates (space-time) diffeomorphisms, but this is hard to make sense of. If $H$, taken to be (8.263), acts on phase
space $T^*\mathcal{R}$ as defined in §8.9, then there simply is no notion of 4d diffeomorphisms. If what is meant is a rewriting
of Theorem 8.2 in Hamiltonian form via (8.310), then, as explained in the main text, the lapse and shift have given
values, and diffeomorphism invariance of the theory is broken by the ensuing foliation of space-time. In that setting,
what remains of the idea that time evolution in GR is a diffeomorphism is that $\partial_t g = \mathcal{L}_{\partial_t} g$ is a Lie derivative (indeed
an infinitesimal diffeomorphism!), which is true for almost any quantity in almost any geometric theory.
Theorem 8.19 Let $(M,g)$ be a globally hyperbolic space-time equipped with a foliation (8.1)
by spacelike Cauchy surfaces $\Sigma_t$, and associated lapse $L$ and shift $S$. Let $(\tilde g(t),\tilde\pi(t))$ be the
canonical data on $\Sigma_t$ induced by the 4-metric $g$ on $M$ via the (equivalent) data $(\tilde g(t),\tilde k(t))$
consisting of the 3-metric and extrinsic curvature on $\Sigma_t$ induced by $g$, cf. (8.209) - (8.210).


• Then $g$ is a solution of the vacuum Einstein equations iff, cf. (8.223) - (8.224):


1. For some $t$ the pair $(\tilde g(t), \tilde\pi(t))$ satisfies the constraint equations $C_0 = 0$ and $C_i = 0$;


2. The maps $t \mapsto \tilde g_{ij}(t)$ and $t \mapsto \tilde\pi^{ij}(t)$ satisfy the evolution equations (8.310), where
the Hamiltonian $H_{(L,S)}$, indexed by the lapse $L$ and shift $S$, is given by (8.263).


• Conversely, given canonical data $(\tilde g,\tilde\pi)$ on $\Sigma$ satisfying the constraints, the evolution
equations (8.310) with specified lapse $L$ and shift $S$ have a solution $(t \mapsto \tilde g_{ij}(t), t \mapsto \tilde\pi^{ij}(t))$,
which is unique in its time domain and defines a globally hyperbolic space-time $(M, g)$
with associated foliation that returns the solution as just described: for each $t$ the pair
$(\tilde g(t),\tilde\pi(t))$, though originally defined on $\Sigma$, are the initial data induced on $\Sigma_t$ by $g$.


In the context of Theorem 7.10 this space-time was only given up to initial-data-preserving 
isometries (as in Hilbert’s version of the hole argument), but in the context of Theorem 8.19 this 
lack of uniqueness is avoided by an explicit choice of the lapse L and shift S. Here it is crucial to 
realize that in the Hamiltonian formalism (as presented in sections 8.7 to 8.10) the Hamiltonian 
generating the dynamics (which allegedly consists of unphysical gauge transformations) is not 
the Hamiltonian constraint (7.148) itself, but the function $H_{(L,S)}$ appearing in Theorem 8.19,
which is indexed by a specific choice of the lapse $L$ and shift $S$. As explained in §8.1, such a
choice amounts to a foliation (8.1) of the space-time that the initial data $(\tilde{g},\tilde{k})$ give rise to.

This foliation is arbitrary (as long as its leaves are spacelike), but once it has been chosen 
(if only implicitly, via the lapse and shift), it sets the standard (or reference frame) against 
which time and change in time are measured.⁴²² The changes we observe in the context of
GR, from the motion of the perihelion of Mercury to the expansion of the universe, are real, 
but their quantification is somewhat arbitrary in that it may depend on the foliation. In other 
words, numerical indicators of change may depend on the reference frame against which they 
are measured, but this is nothing new. The only—but of course crucial—difference with special 
relativity is that in general relativity the “now” has become even more flexible. In Newtonian 
space-time hyperplanes of simultaneity must be horizontal. In Minkowski space-time they are 
no longer unique and may be tilted. In GR, all sorts of curved hypersurfaces $\Sigma_t$ are allowed: as
we argued in §1.10, this is what makes GR general. But even within this increased arbitrariness, 
the causal structure of a space-time (M,g) is well defined and hence it is perfectly clear what 
moving forward in time means, namely moving along a future-directed timelike curve. 

The dissection of the “diffeomorphism” argument against time is similar, to the effect that 
once again time and change are perfectly well defined in GR, but are quantified relative to a 
foliation or reference frame, and hence are less absolute than in pre-relativistic physics.⁴²³

In conclusion, we have argued that (compared to other theories) GR has no new features 
that should affect the philosophical analysis of time. It surely supports the B-series, and seems 
neutral about the existence of the A-series, i.e., about the reality of the “moving present”. 


422 Einstein (1917a) and Hilbert (1917) held this view. Einstein imagined a ‘reference mollusk’ (Bezugsmolluske,
ibid., p. 67), whereas Hilbert prosaically realized the frame as a system of measuring rods and ‘light-clocks’. 
423 See Maudlin (2002) and Healey (2004).

9 Black holes I: Exact solutions


The theory of black holes is an interplay between abstract arguments, like Penrose’s singularity 
theorem, and concrete examples. This chapter is devoted to the latter (but the next one returns to 
the abstract theory). As a warm-up, we start with a simple example that has no true singularity 
but illustrates the remarkable interplay between coordinate singularities and horizons.⁴²⁴


9.1 De Sitter space revisited 


Recall from (4.92) in §4.4 that the two-dimensional de Sitter space $dS_2$ (with unit radius $\rho = 1$) is defined as the surface $-x_0^2 + x_1^2 + x_2^2 = 1$ in $\mathbb{R}^3$ with metric inherited from the Minkowski metric $\eta = -dx_0^2 + dx_1^2 + dx_2^2$. It is a Lorentzian manifold with constant curvature $k = 1$, topologically $dS_2 \cong \mathbb{R} \times S^1$, cf. (4.94). Each of the following coordinatizations is useful:⁴²⁵

$$x_0 = \sinh\tau;\quad x_1 = \cosh\tau\cos\chi;\quad x_2 = \cosh\tau\sin\chi; \tag{9.1}$$
$$x_0 = \sinh t\cos\varphi;\quad x_1 = \cosh t\cos\varphi;\quad x_2 = \sin\varphi; \tag{9.2}$$
$$x_0 = \sinh t\,\sqrt{1-r^2};\quad x_1 = \cosh t\,\sqrt{1-r^2};\quad x_2 = r, \tag{9.3}$$

where $\tau, t \in \mathbb{R}$, $\chi, \varphi \in (-\pi,\pi)$, and $r \in (-1,1)$. In these coordinates, the de Sitter metric is
$$g_{dS} = -d\tau^2 + \cosh^2\tau\, d\chi^2 = -\cos^2\varphi\, dt^2 + d\varphi^2 = -f(r)\,dt^2 + f(r)^{-1}dr^2, \tag{9.4}$$

where $f(r) := 1 - r^2 = (1+r)(1-r)$, see also (6.2) and ensuing discussion.
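The static form of (9.4) can be checked numerically. The following is a minimal sketch, not part of the original argument: it pulls back the Minkowski metric along the embedding (9.3) using central finite differences and compares the result with $-f(r)dt^2 + f(r)^{-1}dr^2$. The function names and the sample point are illustrative assumptions.

```python
import math

def embed_static(t, r):
    """Embedding (9.3) of the static patch into R^3 with coordinates (x0, x1, x2)."""
    s = math.sqrt(1.0 - r * r)
    return (math.sinh(t) * s, math.cosh(t) * s, r)

def eta(a, b):
    """Minkowski inner product with signature (-, +, +)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def induced_metric(t, r, h=1e-5):
    """Components (g_tt, g_tr, g_rr) of the pulled-back metric, by central differences."""
    dt = [(p - q) / (2 * h) for p, q in zip(embed_static(t + h, r), embed_static(t - h, r))]
    dr = [(p - q) / (2 * h) for p, q in zip(embed_static(t, r + h), embed_static(t, r - h))]
    return eta(dt, dt), eta(dt, dr), eta(dr, dr)

t, r = 0.7, 0.4            # arbitrary sample point with |r| < 1
x = embed_static(t, r)
f = 1.0 - r * r
g_tt, g_tr, g_rr = induced_metric(t, r)

print(eta(x, x))           # hyperboloid constraint: close to 1
print(g_tt, -f)            # these two should agree
print(g_rr, 1.0 / f)       # these two should agree
print(g_tr)                # close to 0: no dt dr cross term
```

The same pull-back can be repeated for (9.1) and (9.2) by swapping the embedding function.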


Two-dimensional de Sitter space embedded in three-dimensional Minkowski space. The left picture, extended to $\pm\infty$ along the $x_0$-axis (= z-axis), gives the complete space, as coordinatized by (9.1). The picture on the right (idem dito) is the part coordinatized by (9.2). It does not contain the “singular” points $(0,0,\pm 1)$, so that the boundaries at $r = \pm 1$ or $\varphi = \pm\frac{1}{2}\pi$ do not touch. Its right-hand part, called the static patch, is the part coordinatized by (9.3), with a metric that at least looks singular at $r = \pm 1$.


424 This section, which may be skipped at the expense of a cold start in §9.2, was inspired by §2 of Carter (1973). Carter discusses anti de Sitter space, technically even in a quite different way, but the spirit is similar.

425 The first system can be extended to $\chi \in \mathbb{R}$, giving the metric on the universal covering $\widetilde{dS}_2$. We will not do so.

The first coordinate system $(\tau,\chi)$ covers the entire space, but it obscures the static nature of the metric. Staticity is obvious in the second system $(t,\varphi)$, in which the timelike Killing vector field is given by $\partial_t$. The third system, which is obtained from the second by putting $r = \sin\varphi$ and restricting the range of $\varphi$ to $(-\frac{1}{2}\pi,\frac{1}{2}\pi)$, is pedagogically useful as it gets us closer to the black hole solutions below. But it is also physically motivated, because the special values $r = \pm 1$ and hence the boundaries of the region covered by the $(r,t)$ coordinates correspond to a so-called Killing horizon, where $\partial_t$ becomes lightlike (see §10.8). Although this is suggested by the metric (9.4) we need better coordinates to establish this fact, since the horizon is not within the scope of the $(r,t)$ system. To this end, anticipating the Schwarzschild black hole case, we solve

$$\frac{dr_*(r)}{dr} = \frac{1}{f(r)},$$

where $r \in (-1,1)$ corresponds to $r_* \in \mathbb{R}$, with $r \to \pm 1$ iff $r_* \to \pm\infty$. We proceed by introducing the analogue of the lightlike coordinates $u = t - r$ and $v = t + r$ in Minkowski space-time, i.e.

$$r_* = \operatorname{arctanh} r = \tfrac{1}{2}\ln\frac{1+r}{1-r}, \tag{9.5}$$
$$u = t - r_*, \quad t = \tfrac{1}{2}(v+u); \tag{9.6}$$
$$v = t + r_*, \quad r_* = \tfrac{1}{2}(v-u). \tag{9.7}$$
In terms of these, via the relation $r = \tanh r_*$, the metric is easily found to be
$$g_{dS} = -f(r)\,du\,dv = -\left(1 - \tanh^2(\tfrac{1}{2}(v-u))\right)du\,dv. \tag{9.8}$$
This expression is still singular as $r \to \pm 1$. To remedy this at least near $r = +1$, we introduce
$$-U = e^{u}; \quad V = e^{-v}, \tag{9.9}$$
which clearly satisfy $U < 0$, $V > 0$, and, in view of (9.9), (9.6) - (9.7), and (9.5), we have
$$UV = -\exp(-2r_*) = \frac{r-1}{r+1}. \tag{9.10}$$
In terms of these coordinates, the metric is
$$g_{dS} = -\frac{4\,dU\,dV}{(1-UV)^2}, \tag{9.11}$$
and the coordinates $(x_0,x_1,x_2)$ in terms of which $dS_2$ was originally defined are
$$x_0 = \frac{-U-V}{1-UV}; \quad x_1 = \frac{-U+V}{1-UV}; \quad x_2 = \frac{1+UV}{1-UV}. \tag{9.12}$$
Since $r \to 1$ corresponds to $r_* \to \infty$ and hence to $UV \to 0$, the metric is now regular as $r \to 1$. We can even pass through this barrier by allowing $U$ to also be zero or positive, at least provided $UV < 1$, where we note that $UV = 1$ corresponds to $r = \infty$ (whilst $r = -1$ corresponds to $UV = \pm\infty$). This may be motivated by allowing (9.5) also for $r > 1$; if for such $r$ we redefine $U$ by $U = \exp(u)$, then (9.10) remains valid also for $r > 1$. Furthermore, we may include $V < 0$ in the picture, which leads to the situation described in and after the diagram below. Note that the $(U,V)$ coordinates fail to cover all of de Sitter space; a similar construction with $r$ replaced by $-r$ gives another coordinate system that covers the part near $r = -1$ and both systems together describe all of $dS_2$ (this will not be necessary for the Schwarzschild solution, which is easier!).

Kruskal-like diagram for (part of) de Sitter space in U-V lightlike coordinates. Region I, where $U < 0$ and $V > 0$, is the static patch ($-1 < r < 1$), and region IV, where $U > 0$ and $V < 0$ (also $-1 < r < 1$), is its mirror image (in the y-z plane). Region II, where $U > 0$ and $V > 0$ (below the wiggly $UV = 1$ line) covers the open part of the right-hand green figure behind the x-z plane at $z < 0$, whereas region III, where $U < 0$ and $V < 0$, covers its part at $z > 0$. The part of de Sitter space left open in the right-hand green figure in front of the x-z plane is not covered by the U-V coordinates (it lies at infinity).
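The chain of coordinate changes from $(t,r)$ via $(u,v)$ to $(U,V)$ is easy to get wrong by a sign, so a quick numerical sanity check may be useful. The sketch below (function names and the sample point are our own choices) maps a point of the static patch to $(U,V)$ and compares both sides of (9.10), as well as the embedding (9.12) against (9.3).

```python
import math

def kruskal_from_static(t, r):
    """Map static-patch coordinates (t, r), |r| < 1, to (U, V) as in (9.9)."""
    r_star = math.atanh(r)             # (9.5)
    u, v = t - r_star, t + r_star      # (9.6) - (9.7)
    return -math.exp(u), math.exp(-v)  # region I: U < 0, V > 0

def embed_from_kruskal(U, V):
    """Embedding coordinates (x0, x1, x2) as in (9.12)."""
    d = 1.0 - U * V
    return (-U - V) / d, (-U + V) / d, (1.0 + U * V) / d

t, r = 0.3, 0.6                        # arbitrary sample point
U, V = kruskal_from_static(t, r)
x0, x1, x2 = embed_from_kruskal(U, V)
s = math.sqrt(1.0 - r * r)

print(U * V, (r - 1.0) / (r + 1.0))    # both sides of (9.10)
print(x0, math.sinh(t) * s)            # (9.12) against (9.3)
print(x1, math.cosh(t) * s)
print(x2, r)
print(-x0**2 + x1**2 + x2**2)          # hyperboloid constraint, close to 1
```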


Returning to our Killing vector field $\partial_t$ for the metric, in the new coordinates we obtain

$$\partial_t = U\partial_U - V\partial_V; \tag{9.13}$$
$$g(\partial_t,\partial_t) = \frac{4UV}{(1-UV)^2}, \tag{9.14}$$

which vanishes as $r \to 1$, as predicted. Thus the Killing field $\partial_t$ changes from being timelike in
region I (i.e. $r < 1$) to being lightlike for $UV = 0$ ($r = 1$) to being spacelike for $U > 0$, $V > 0$
($r > 1$). This makes the line $r = 1$, or $UV = 0$, a Killing horizon, a concept we will return
to in §10.8; the cross $r = +1$ is a bifurcate Killing horizon, see Definition 10.20. From the
perspective of a static observer (i.e. r = constant) in the static patch —1 < r < 1, compared to 
Minkowski space-time the unusual situation arises that even in an infinite lifetime only signals 
from within the static patch will reach them, the entire rest of de Sitter space being invisible 
forever (in contrast, any static observer in Minkowski space-time will eventually be able to detect 
signals from any other physical systems anywhere in space-time). Indeed, just rotate the first 
picture in §5.10 by 90 degrees and you see the lightcones in de Sitter space: moving up in time, 
the backward lightcone does not increasingly open up, but remains confined to the static patch. 

As such, the Killing horizon is also an event horizon. It can be crossed (by a non-static 
observer, such as a light ray or an accelerating observer), but the difference with a Schwarzschild 
black hole is that an observer crossing the horizon, i.e. moving from region I to region II, will not 
necessarily fall into a singularity, because de Sitter space has none (it is geodesically complete). 
Instead, the coordinate singularity UV = 1 is simply the end of de Sitter space at infinity. As we 
shall see, in the Reissner-Nordström solution, cf. §9.5, the situation is again different.




9.2 The Schwarzschild solution and some of its geodesics 


After this warm-up, we now state the first curved solution to the vacuum Einstein equations:⁴²⁶
$$g_S = -f(r)\,dt^2 + f(r)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2); \tag{9.15}$$
$$f(r) := 1 - \frac{2m}{r} = \frac{1}{r}(r - 2m). \tag{9.16}$$

This is the Schwarzschild metric, defined, for the moment, for some constant $m > 0$ (see §9.5 for $m < 0$),⁴²⁷ and coordinates $t \in \mathbb{R}$, $(\theta,\varphi) \in S^2$, and $r > r_S := 2m$, the Schwarzschild radius. The arguably most pedagogical road towards it, which goes back to Hilbert, is as follows.⁴²⁸


1. Staticity. Assume: (i) $M = \mathbb{R} \times \Sigma$; (ii) coordinates adapted to this; (iii) arbitrary lapse but zero shift. Then any static solution to the vacuum Einstein equations takes the form (8.96).

2. Spherical symmetry. Assuming $\Sigma = \mathbb{R}^3 \setminus B_c^3$ (for some $c > 0$), we may start from
$$\tilde{g} = M(r)\,dr^2 + r^2 d\Omega; \quad d\Omega := d\theta^2 + \sin^2\theta\, d\varphi^2, \tag{9.17}$$
where we use polar coordinates $(r,\theta,\varphi)$ in which the radial variable $r$ has been normalized so as to give two-spheres $S^2$ with radius $r$ a surface area $4\pi r^2$, as in flat space.

3. In the initial value problem,⁴³⁰ staticity implies $\tilde{k} = 0$. Up to constant rescaling, the most general spatial metric $\tilde{g}$ solving the constraint (8.101) is given by $M(r) = f(r)^{-1}$ for some $m \in \mathbb{R}$.

4. The remaining Einstein equation (8.100) then yields $L(r) = \sqrt{f(r)}$, and hence (9.15).⁴³¹


Note that (given the above choice of $\Sigma$), asymptotic flatness (see §8.4) follows from staticity
and spherical symmetry. In §10.9 we give two other derivations of the Schwarzschild metric,
namely Birkhoff’s theorem 10.22, which derives the metric (and hence its staticity as well as its
asymptotic flatness) from spherical symmetry alone, and Israel’s theorem 10.25, which derives
the metric from staticity, asymptotic flatness, and the existence of a smooth event horizon.


426 A solution equivalent to this one was first found by Schwarzschild (1916). Up to differences in notation, it 
was stated in the above form by Droste (1916), Hilbert (1917), and Weyl (1917). Karl Schwarzschild (1873-1916) 
communicated his solution in a letter to Einstein dated 22 December 1915, written from the Russian front. He 
died on May 11, 1916, though not from the War but from the rare autoimmune skin disease pemphigus. Johannes 
Droste (1886-1963) was a PhD student of Lorentz, who did not know Schwarzschild’s work but (re)discovered the 
solution a few months after him. Droste was a professor of mathematics at Leiden from 1930-1956. Hilbert and 
Weyl both cite Schwarzschild, whose solution differs from these later versions since, like Einstein at an earlier stage, 
he worked in unimodular coordinates (i.e. det(g) = —1). See Antoci & Liebscher (2001) and Antoci (2003). 

427 In physical units, $m = GM/c^2$. The constant $m$ equals the mass of the asymptotically flat space-time (9.15), cf. (8.108) etc. Alternatively, a static observer is described by the four-velocity $u = f(r)^{-1/2}\partial_t$, normalized to $g(u,u) = -1$, which gives an acceleration of $\nabla_u u = mr^{-2}\partial_r$. If we replace $dt^2$ in the metric (9.15) by $c^2dt^2$, as we should in physical units, this is the same formula as in Newtonian gravity. Finally, the Schwarzschild radius $r_S = 2m$ can be found (in physical units) from Newtonian gravity as the critical radius of a gravitating ball with mass $m$ for which the escape velocity $v$ equals the speed of light $c$; indeed, one has $\frac{1}{2}c^2 = Gm/r_S$, i.e. $r_S = 2Gm/c^2$.

428 See Hilbert (1917), of which O’Neill (1983), chapter 13, gives a modern presentation.

429 A deeper perspective on spherical symmetry will be given in §10.9 in connection with Birkhoff’s theorem.

430 As an initial-value problem the Schwarzschild case with initial data on $\Sigma = \mathbb{R}^3 \setminus B_c^3$ is unsatisfactory because as a Riemannian manifold $(\Sigma,\tilde{g})$ is incomplete, even if $c = 2m$. This will be resolved by the Kruskal solution.

431 The simplest way to get there is to solve $\Delta_{\tilde{g}}L = 0$, which comes down to $(f(r)^{1/2}r^2L'(r))' = 0$. This is solved by $L(r) = \sqrt{f(r)}$, which also solves (8.100), or by $L(r) = C$, which does not, cf. Schoen (2009), Lecture 5. Note that in the above Hilbert-style derivation the Ansatz “$L = L(r)$” is supposed to follow from spherical symmetry, too.

On the nose, the solution (9.15) applies to both $r > 2m$ and $0 < r < 2m$, and as we shall see in §9.3, the value $r = 2m$ is merely a coordinate singularity. Although in the present section we restrict ourselves to $r > 2m$, it is worth mentioning that $r = 0$ is a genuine singularity, both in the sense of the singularity theorems and in the sense that curvature blows up: this can be detected in a coordinate-free way through the so-called Kretschmann scalar

$$R^{\rho\sigma\mu\nu}R_{\rho\sigma\mu\nu} = \frac{48m^2}{r^6}. \tag{9.18}$$

If a star has radius R > 2m, then its interior is modelled by some nonzero energy-momentum 
tensor, so that the vacuum Einstein equations to which (9.15) is a solution are only relevant for 
r > R. This is the case, for example, with our Sun. If, on the other hand, R < 2m, then the only 
physically stable situation to which a static solution like (9.15) could possibly apply is R = 0, 
which describes a black hole. See footnote 458. In that case, the vacuum solution (9.15) applies 
to both r > 2m and 0 < r < 2m. Since all black holes in the universe are believed to rotate (and 
hence are stationary but not static), it seems that the Schwarzschild solution for 0 < r < 2m does 
not describe anything in Nature; one would need the Kerr solution instead (see §9.6). However,
one can do a few simple calculations about the metric (9.15) that are hardly changed by rotation
and explain key features of the famous image of the supermassive black hole in M87:⁴³²


First Image of the Supermassive Black Hole in M87, revealed on April 10, 2019.⁴³³


A black hole obviously does not emit any radiation itself. But if it is “illuminated” (at a 
typical radio astronomy wavelength like 1.3 mm, so that the colors are fake), then some deflected 
photons may reach us and provide us with an indirect image. In the case at hand, illumination 
comes from a thin accretion disc whose constituents on average move around the black hole in 
circular geodesics and emit photons, converting gravitational energy into radiation. The aim 
of the following calculations is to show that the photon capture radius, i.e. the radius of the 
central dark disc known as the black hole shadow, is $a = \sqrt{27}\,m$, instead of $2m$ as one would
find by (wrongly) identifying the disc with the interior of the black hole as defined by its event
horizon. Furthermore, we will give an idea of the origin of the bright area, which is a blurred
image of the photon sphere of the black hole, which is located at $r = 3m$ (from which the step to
$a = \sqrt{27}\,m$ is straightforward geometry). The key to the structure of the black hole shadow is the
existence of (unstable) circular photon orbits at radius $r = 3m$ (and no other value).⁴³⁴ Perhaps
paradoxically, gravitational lensing makes this radius the edge of the shadow. The instability of
all circular geodesic orbits of massive particles at radii $r < 6m$ also plays a role.


432 The Event Horizon Telescope (2019a) expects only a 4% change in $a$ between Schwarzschild and Kerr.

433 Source: https://eventhorizontelescope.org. Credit: The Event Horizon Telescope Collaboration.

434 This was noted by Hilbert (1917)! Modern references are Luminet (1979) and Event Horizon Telescope Collaboration (2019ab). See also Misner, Thorne, & Wheeler (1973), chapter 25, and Chrusciel (2020), §3.9.

To start, we describe geodesics (which as always are affinely parametrized by definition)

$$\gamma(s) = (t(s), r(s), \theta(s), \varphi(s)) \tag{9.19}$$

in the Schwarzschild metric (9.15). In the absence of off-diagonal terms, the vector fields $\partial_\mu$ are orthogonal, from which it is easy to show that the geodesic equation (3.24) becomes

$$\frac{d}{ds}\left(g_{\mu\mu}\dot{x}^\mu\right) = \tfrac{1}{2}\sum_{\nu=0}^{3} (\dot{x}^\nu)^2\,\partial_\mu g_{\nu\nu}, \tag{9.20}$$

where $\dot{x}^\mu = dx^\mu(s)/ds$, and $\mu = 0,1,2,3$ is fixed. Then $\mu = 0$ gives $d(f\dot{t})/ds = 0$; $\mu = 2$ gives $d(r^2\dot{\theta})/ds = r^2\sin\theta\cos\theta\,\dot{\varphi}^2$; and $\mu = 3$ gives $d(r^2\sin^2\theta\,\dot{\varphi})/ds = 0$. Hence we may set
$$f(r(s))\,\dot{t}(s) = E; \quad \theta(s) = \pi/2; \quad r(s)^2\dot{\varphi}(s) = L, \tag{9.21}$$

where $E$ and $L$ are constants, interpreted as energy and angular momentum, respectively.⁴³⁵ The case $L = 0$ gives radial motion (i.e. at constant $\theta$ and $\varphi$). If also $g_{\mu\nu}\dot{\gamma}^\mu\dot{\gamma}^\nu = 0$, then

$$t(s) = \pm(s + 2m\ln|s|) + C; \quad r(s) = s + 2m, \tag{9.22}$$

for constant $C$, gives the radial lightlike geodesics, initially with $s > 0$ and hence $r > 2m$ (see §9.3 for $r < 2m$). Radial lightlike geodesics do not contribute to the black hole shadow, whereas radial timelike geodesics of massive particles do not contribute to the accretion disc, and therefore do not produce the photons in the EHT image either. Hence for understanding this image we may assume $L \neq 0$. In that case, we may invert $\varphi(s)$ to make $r$ a function of $\varphi$ instead of $s$.

We now use the fact that for geodesic motion the combination $g_{\mu\nu}\dot{\gamma}^\mu\dot{\gamma}^\nu$ is constant, i.e.

$$g_{\mu\nu}\dot{\gamma}^\mu\dot{\gamma}^\nu = -A^2, \tag{9.23}$$

with e.g. $A = 0$ for photons. Using (9.21), eq. (9.23) may be written in terms of $\mathcal{E} := E^2/L^2$ as

$$\frac{1}{r^4}\left(\frac{dr}{d\varphi}\right)^2 + V(r) = \mathcal{E}; \quad V(r) = \left(1 - \frac{2m}{r}\right)\left(\frac{1}{r^2} + \frac{A^2}{L^2}\right), \tag{9.24}$$

where in the massless case $\mathcal{E}$ is usually called $1/b^2$, with impact parameter $b = L/E$. Thus we can describe geodesic motion near a black hole as motion in a potential $V$, where $\varphi$ plays the role of time. In fact, the second ($\mu = 1$) entry in (9.21) can also be derived from (9.24), viz.

$$\frac{1}{r^4}\frac{d^2r}{d\varphi^2} - \frac{2}{r^5}\left(\frac{dr}{d\varphi}\right)^2 = -\frac{1}{2}\frac{dV}{dr}. \tag{9.25}$$
We start with the massless case $A = 0$, so that the potential $V$ in (9.24) becomes
$$V(r) = \frac{f(r)}{r^2} = \frac{1}{r^2} - \frac{2m}{r^3}. \tag{9.26}$$

This potential is plotted below. It has a maximum at $r = 3m$, at which the critical energy is

$$\mathcal{E}_c = V(3m) = 1/(27m^2). \tag{9.27}$$


435 As we shall show systematically for the Kerr metric, see §9.6 from eq. (9.122) onwards, the conserved quantities $E$, $L$, and $\pi/2$ come from three Killing vector fields $K$ for the Schwarzschild metric, taking the form $g(\dot{\gamma}, K)$.

$V(r)$ in units of $m$, i.e. $V = 0$ at $r = 2m$ and $V$ is maximal at $r = 3m$, where $V(3m) = 1/(27m^2)$.
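The stated properties of the massless potential (9.26) can be confirmed with a few lines of arithmetic. The sketch below sets $m = 1$ (an arbitrary choice of units) and evaluates $V$ and its first two derivatives, which were computed by hand from (9.26); the function names are ours.

```python
def V(r, m=1.0):
    """Massless effective potential (9.26)."""
    return 1.0 / r**2 - 2.0 * m / r**3

def dV(r, m=1.0):
    """First derivative V'(r)."""
    return -2.0 / r**3 + 6.0 * m / r**4

def d2V(r, m=1.0):
    """Second derivative V''(r)."""
    return 6.0 / r**4 - 24.0 * m / r**5

m = 1.0
print(V(2.0 * m))                        # V vanishes at the horizon r = 2m
print(dV(3.0 * m))                       # critical point at r = 3m
print(V(3.0 * m), 1.0 / (27.0 * m**2))   # critical energy (9.27)
print(d2V(3.0 * m), -2.0 / (81.0 * m**4))  # negative: a maximum

# A coarse scan confirms that r = 3m is the global maximum for r > 2m.
grid = [2.0 + 0.001 * k for k in range(1, 20000)]
r_best = max(grid, key=V)
print(r_best)
```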


• The photon sphere is the orbit $r = 3m$ (i.e. $dr/d\varphi = 0$) at the critical value $\mathcal{E}_c$ where $V$ takes its maximum. Since $V'(3m) = 0$, it follows from (9.25) that the circular orbit $r(\varphi) = 3m$ is a geodesic. This orbit is unstable,⁴³⁶ since $V''(3m) = -2/(81m^4) < 0$.

• Photons with $\mathcal{E} > \mathcal{E}_c$ starting at $r > 3m$ cross the barrier and fall into the black hole.

• Photons with $\mathcal{E} < \mathcal{E}_c$ are (eventually) reflected at the periastron $r_{\mathcal{E}} > 3m$ where $V(r_{\mathcal{E}}) = \mathcal{E}$ and then increase $r$ again (perhaps after having orbited the black hole), as in the artist’s impression below. Such photons cannot cross the photon sphere, but depending on their energy they can come arbitrarily close to it. Almost all photons detected on earth belong to this category, which explains both the relatively sharp edge of the black hole shadow at $r = 3m$, or rather $a = \sqrt{27}\,m$, see (9.28) below, and the bright area around it.⁴³⁷
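The dichotomy between capture and reflection can be illustrated numerically. A minimal sketch, under the following assumptions of ours: we substitute $u = 1/r$, so that (9.24) with $A = 0$ and $\mathcal{E} = 1/b^2$ reads $(du/d\varphi)^2 = 1/b^2 - u^2 + 2mu^3$, and differentiating gives $u'' = -u + 3mu^2$. The photon starts at infinity ($u = 0$) and is followed with a hand-written fourth-order Runge-Kutta step; the function name, step size, and cut-off are illustrative.

```python
def classify_photon(b, m=1.0, h=1e-3, phi_max=50.0):
    """Follow a photon arriving from infinity with impact parameter b.

    Integrates u'' = -u + 3*m*u**2 with u = 1/r and returns 'captured' if the
    photon reaches the horizon u = 1/(2m), 'scattered' if it returns to u = 0.
    """
    def rhs(u, w):
        return w, -u + 3.0 * m * u * u

    u, w = 0.0, 1.0 / b          # at infinity (du/dphi)^2 = 1/b^2
    for _ in range(int(phi_max / h)):
        k1u, k1w = rhs(u, w)
        k2u, k2w = rhs(u + 0.5 * h * k1u, w + 0.5 * h * k1w)
        k3u, k3w = rhs(u + 0.5 * h * k2u, w + 0.5 * h * k2w)
        k4u, k4w = rhs(u + h * k3u, w + h * k3w)
        u += h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6.0
        w += h * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
        if u >= 1.0 / (2.0 * m):
            return "captured"
        if u <= 0.0:
            return "scattered"
    return "undecided"           # e.g. b extremely close to the critical value

# The critical impact parameter is sqrt(27) m, roughly 5.196 m.
print(classify_photon(5.0))      # below the critical value
print(classify_photon(5.5))      # above the critical value
```

Impact parameters very close to $\sqrt{27}\,m$ make the photon linger near $r = 3m$ for many turns, in which case a larger `phi_max` would be needed.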


We now explain why the apparent radius $a$ of the black hole shadow does not equal $a = 3m$ but
$$a = \sqrt{27}\,m. \tag{9.28}$$

Let $\eta$ be the (very very very) small angle between the radial direction from us to the center of
the black hole,⁴³⁸ given by the vector $X = -\partial_r$, and the direction in which we see the photon.


436 There is one exception making the photon sphere “attractive”: photons whose “energy” is exactly equal to $\mathcal{E}_c$ that are not already at $r = 3m$ will asymptotically spiral towards the photon sphere (and hence are invisible to us).

437 The shadow is not absolutely black, since there is some leakage from photons coming from $2m < r < 3m$. See e.g. Narayan, Johnson, & Gammie (2019), also for a general explanation of the shadow. All photons in the image of the black hole in M87 are produced as synchrotron radiation by either a hot diffuse plasma accreting onto the black hole, or a collimated plasma jet.

438 We quote from https://blackholecam.org/research/bhshadow/: ‘The predicted size of the shadow cast by the event horizon of the supermassive black hole at the center of our own Milky Way is about 50 microarcseconds (that is one fifty millionth of an arcsecond, which is 1/3600th of a degree!). Although super small, this angular size can actually be resolved by astronomical observations using an interferometric technique at radio wavelengths, called Very Long Baseline Interferometry or VLBI.’ This makes the image by the Event Horizon Telescope an incredible technological achievement, on a par with the first detection of gravitational waves by LIGO in 2015.

This direction is the vector $Y = (dr/d\varphi)\partial_r + \partial_\varphi$. The angle is given by the usual formula

$$g(X,Y) = \sqrt{g(X,X)}\sqrt{g(Y,Y)}\cos\eta, \tag{9.29}$$

which holds in Riemannian geometry as well as it does in Euclidean geometry. Eq. (9.15) gives

$$\cos^2\eta = \frac{(dr/d\varphi)^2}{(dr/d\varphi)^2 + r^2(1 - 2m/r)}. \tag{9.30}$$

Eliminating $(dr/d\varphi)^2$ via (9.24) with (9.26) and $\mathcal{E} = \mathcal{E}_c$ given in (9.27) gives Synge’s formula

$$\sin^2\eta = \frac{27m^2(r - 2m)}{r^3}. \tag{9.31}$$
Here $r$ is our distance to the black hole, and we take $\mathcal{E} = \mathcal{E}_c$ since we want to compute the angle for the boundary of the black hole shadow, as explained above. On the other hand, Euclidean geometry as naively used by an observer at (practically) infinity in flat space-time gives

$$\frac{a}{r} = \tan\eta. \tag{9.32}$$

For very small $\eta$ we have $\tan\eta \approx \sin\eta \approx \eta$. Also, in (9.31) we neglect the $2m/r^3$ term against $r/r^3 = 1/r^2$ because $r$ is very large. Eqs. (9.31) and (9.32) then immediately yield (9.28).
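To see the limit at work, one may evaluate Synge’s formula (9.31) at increasing distances and watch $r\tan\eta$ approach $\sqrt{27}\,m$. The sketch below also gives a rough order-of-magnitude estimate for M87; the mass ($6.5 \times 10^9$ solar masses) and distance (16.8 Mpc) are assumed round input values, not taken from the text, and the physical constants are rounded.

```python
import math

def shadow_angle(r, m=1.0):
    """Angle eta from Synge's formula (9.31), for an observer at radius r > 3m."""
    return math.asin(math.sqrt(27.0 * m**2 * (r - 2.0 * m) / r**3))

# Convergence of a = r tan(eta) to sqrt(27) m as the observer recedes (m = 1).
for r in (1e2, 1e4, 1e6):
    print(r, r * math.tan(shadow_angle(r)), math.sqrt(27.0))

# Rough estimate for M87 (ASSUMED inputs, rounded constants, SI units).
G, c = 6.674e-11, 2.998e8
M_sun, Mpc = 1.989e30, 3.086e22
m_M87 = G * 6.5e9 * M_sun / c**2          # m = GM/c^2 in metres
D = 16.8 * Mpc                            # distance in metres
rad_to_uas = (180.0 / math.pi) * 3600.0 * 1e6
diam_uas = 2.0 * shadow_angle(D, m_M87) * rad_to_uas
print(diam_uas)                           # shadow diameter in microarcseconds
```

With these inputs the diameter comes out at a few tens of microarcseconds, which is the scale quoted for VLBI observations in the footnote above.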
Finally, we show that there are no stable circular geodesic orbits of massive particles for
$r < 6m$. First, putting $dr/d\varphi = 0$ in (9.25) gives $V'(r) = 0$, which in this case, i.e. $A \neq 0$ in
(9.24), unlike the massless case ($A = 0$) does not lead to a unique solution but to the condition

$$r - 3m = mA^2r^2/L^2. \tag{9.33}$$

Hence $r > 3m$. Using (9.33), the stability condition $V''(r) > 0$ then becomes $r > 6m$.
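For completeness, the two steps just quoted can be written out as follows. This is a worked sketch based on the form of $V$ in (9.24); the abbreviation $\alpha := A^2/L^2$ is ours and is introduced only to shorten the formulas.

```latex
\begin{align*}
V(r) &= \Bigl(1-\frac{2m}{r}\Bigr)\Bigl(\frac{1}{r^2}+\alpha\Bigr)
      = \frac{1}{r^2}-\frac{2m}{r^3}+\alpha-\frac{2m\alpha}{r},
      \qquad \alpha := \frac{A^2}{L^2};\\
V'(r) &= -\frac{2}{r^3}+\frac{6m}{r^4}+\frac{2m\alpha}{r^2} = 0
      \iff r-3m = m\alpha r^2;\\
V''(r) &= \frac{6}{r^4}-\frac{24m}{r^5}-\frac{4m\alpha}{r^3}
       = \frac{6r-24m-4(r-3m)}{r^5}
       = \frac{2(r-6m)}{r^5}.
\end{align*}
```

In the last line $m\alpha r^2$ was replaced by $r - 3m$ using the second line, so that $V''(r) > 0$ on a circular orbit is equivalent to $r > 6m$.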


Artist’s impression of the paths of photons in the vicinity of a black hole. The gravitational bending and capture of light by the event horizon is the cause of the black hole shadow.⁴³⁹

439 Source: https://www.almaobservatory.org/en/images/photon-paths-around-a-black-hole/. Credit: Nicolle R. Fuller/NSF.

9.3 The event horizon of Schwarzschild space-time

We now show how to cross the barrier $r = 2m$. In the coordinates used to express (9.15) this is a bit awkward,⁴⁴⁰ since the metric is simply undefined at $r = 2m$. We resolve this coordinate singularity as in de Sitter space, see (9.6) - (9.7). We again introduce lightlike coordinates
$$u = t - r_*, \quad t = \tfrac{1}{2}(v+u); \tag{9.34}$$
$$v = t + r_*, \quad r_* = \tfrac{1}{2}(v-u), \tag{9.35}$$

where the new (‘tortoise’) radial coordinate $r_* = r_*(r)$ is defined as the solution, for $r > 2m$, of

$$\frac{dr_*(r)}{dr} = \frac{1}{f(r)}; \quad f(r) = 1 - \frac{2m}{r}. \tag{9.36}$$

This fixes $-\infty < r_* < \infty$, corresponding to $2m < r < \infty$, up to a constant. The variables

$$x_* = (r_*/2m) - 1; \quad x = (r/2m) - 1, \tag{9.37}$$
turn eq. (9.36) into $dx_*/dx = 1 + x^{-1}$, which for $x > 0$ is solved by $x_* = x + \ln x + C$. Hence

$$r_*(r) = r + 2m\ln\left|\frac{r}{2m} - 1\right| = r + 2m\ln|r - 2m| - 2m\ln(2m) \tag{9.38}$$

solves (9.36). One may also solve $r$ for $r_*$: for $x > 0$ we have $x = W(e^{x_*})$, where $W$ is the Lambert W-function, defined for $x > 0$ by $W(x)e^{W(x)} = x$. Up to a constant,⁴⁴¹ this gives

$$r(r_*) = 2m\left(W\!\left(e^{r_*/2m - 1}\right) + 1\right). \tag{9.39}$$
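The pair (9.38) and (9.39) can be tested by a round trip $r \mapsto r_* \mapsto r$. Since the Python standard library has no Lambert W-function, the sketch below computes $W$ by Newton’s method on $we^w = z$; this is our own illustrative implementation (function names included), adequate for moderate arguments but not a substitute for a library routine.

```python
import math

def tortoise(r, m=1.0):
    """r_*(r) from (9.38), for r > 2m."""
    return r + 2.0 * m * math.log(r / (2.0 * m) - 1.0)

def lambert_w(z, tol=1e-14, max_iter=100):
    """Principal branch W(z) for z > 0, by Newton's method on w*exp(w) = z."""
    w = math.log1p(z)                 # starts at or above the root for z > 0
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol * (1.0 + abs(w)):
            break
    return w

def radius(r_star, m=1.0):
    """Inverse map (9.39); math.exp overflows for very large r_star / m."""
    return 2.0 * m * (lambert_w(math.exp(r_star / (2.0 * m) - 1.0)) + 1.0)

for r in (2.001, 3.0, 10.0, 50.0):
    print(r, tortoise(r), radius(tortoise(r)))   # last column reproduces r
```

Note how slowly $r_*$ decreases towards $-\infty$ as $r \to 2m$: the divergence is only logarithmic.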
To interpret the coordinate $r_*$, note that if in a geodesic (9.19) we write the radial coordinate $r(s)$ as $r(t)$ by inverting $t(s)$, and subsequently express $r$ in terms of $r_*$, from (9.21) we obtain
$$\frac{dr}{ds} = E\,\frac{dr_*}{dt}. \tag{9.40}$$
This relates two perspectives on radial geodesics: travellers use proper time $s$ and undergo $s \mapsto r(s)$, whereas static observers, who by definition are at rest in $(r,\theta,\varphi)$ and use time $t$,


440 During a lecture in Paris on April 5, 1922, Hadamard asked Einstein what happened if the radius of the Sun were less than the Schwarzschild radius (Nordmann, 1922; Biezunski, 1987). After much confusion, including names (and views) of the $r = 2m$ sphere like “discontinuity” (Schwarzschild), “magic circle” (Eddington), “barrier” (Kottler), “limit circle” (Brillouin), and even “the death” (Nordmann), for which information we are indebted to a seminar by Dennis Lehmkuhl on April 12th, 2021, at last Lemaitre (1933), §11, concluded that ‘The singularity of the Schwarzschild field [i.e. at r = 2m] is thus a fictitious singularity, analogous to that which appears at the horizon of the centre in the original form of the de Sitter universe.’ Earlier, Eddington (1924) had contributed the coordinates now named after him; in fact, he did not use $(u,r)$ or $(v,r)$ but, with an obvious typo corrected, $(t_*,r)$, where $t_* = t - 2m\ln|r - 2m|$, so that $u = t - r_* = t_* - r$ (up to a constant $2m\ln(2m)$). This turns the metric (9.15) into

$$ds^2 = -\left(1 - \frac{2m}{r}\right)dt_*^2 + \left(1 + \frac{2m}{r}\right)dr^2 - \frac{4m}{r}\,dt_*\,dr + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2),$$

which is well defined and Lorentzian for all $r > 0$. Lemaitre’s coordinates were different but had the same effect. Finkelstein (1958) first interpreted $r = 2m$ as an event horizon, though not using this term, which had just been introduced by Rindler (1956), who in turn did not mention the Schwarzschild solution! Penrose (1968) first rigorously put all this together. See Godart (1992), Eisenstaedt (1993), and Earman (1995), §1.2, for history.

441 Take $x_* = \int_\varepsilon^x dy\,(1 + y^{-1})$, where $\varepsilon > 0$ solves $\varepsilon = -\ln\varepsilon$, so that in (9.38) one takes $a = 2m(1+\varepsilon)$.

monitor $t \mapsto r_*(t)$. For example, it follows from (9.22), see also (9.66) below, that a future-directed ingoing radial lightlike geodesic approaching $r \to 2m$ from $r > 2m$ takes the form

$$t(s) = s - 2m\ln(-s) + C; \quad r(s) = -s + 2m. \tag{9.41}$$
It then follows from (9.40), in which (9.41) gives $dr/ds = -1$ and $E = 1$, that
$$r_*(t) = -t + C. \tag{9.42}$$

Therefore, it takes $t \to \infty$ to reach $r_* \to -\infty$, which is the same as $r = 2m$ (infinite redshift). For $(u,v) \in \mathbb{R}^2$, i.e. $(t,r_*) \in \mathbb{R}^2$ and hence still $2m < r < \infty$, eqs. (9.34) - (9.35) imply

$$g_S = -f(r)\,du\,dv + r^2d\Omega, \tag{9.43}$$

where $r = r(r_*) = r(u,v)$ through (9.39) and (9.35). This shows that radial lightlike geodesics are given by constant $u$ (outgoing) or constant $v$ (ingoing), as in Minkowski space-time: for $r > 2m$ the former comes from the plus sign in (9.22) with $s > 0$, whereas the latter come from the minus sign with $s < 0$, as can also be directly seen from (9.38). See also (9.65) - (9.68).
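Both statements, the divergence of $t$ at the horizon and the constancy of $v$ along ingoing rays, can be observed numerically. The sketch below evaluates (9.41) for affine parameters $s \to 0^-$ with the illustrative choices $m = 1$ and $C = 0$, and computes $v = t + r_*$ using (9.38); the function names are ours.

```python
import math

m, C = 1.0, 0.0   # illustrative choices of mass parameter and integration constant

def ingoing_ray(s):
    """Ingoing radial lightlike geodesic (9.41), valid for s < 0."""
    return s - 2.0 * m * math.log(-s) + C, -s + 2.0 * m

def tortoise(r):
    """r_*(r) from (9.38)."""
    return r + 2.0 * m * math.log(abs(r / (2.0 * m) - 1.0))

rows = []
for s in (-1.0, -1e-2, -1e-4, -1e-6):
    t, r = ingoing_ray(s)
    rows.append((s, r, t, t + tortoise(r)))   # last entry is v = t + r_*
    print(rows[-1])

# Analytically, v = C + 2m - 2m ln(2m) for every s.
v_exact = C + 2.0 * m - 2.0 * m * math.log(2.0 * m)
print(v_exact)
```

The horizon is reached at the finite affine parameter $s = 0$, while the coordinate time $t$ grows without bound; $v$, by contrast, stays fixed, which is why $(v,r)$ are suitable coordinates for crossing $r = 2m$.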

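To cross $r = 2m$, we use Eddington-Finkelstein coordinates, in two versions: the ingoing coordinates $(v,r) \in \mathbb{R} \times (2m,\infty)$ and the outgoing ones $(u,r) \in \mathbb{R} \times (2m,\infty)$, with metrics

$$g_+ = -f(r)\,dv^2 + 2\,dv\,dr + r^2d\Omega; \tag{9.44}$$
$$g_- = -f(r)\,du^2 - 2\,du\,dr + r^2d\Omega. \tag{9.45}$$

These expressions suddenly make sense for any $r \in (0,\infty) = \mathbb{R}_+$! Schwarzschild space-time is
$$M_S = \mathbb{R} \times \mathbb{R}_+ \times S^2 \cong \mathbb{R} \times (\mathbb{R}^3 \setminus \{0\}), \tag{9.46}$$

with metric (9.44), where now $(v,r) \in \mathbb{R} \times \mathbb{R}_+$ and (as always) $\theta \in [0,\pi]$, $\varphi \in [0,2\pi)$.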
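The passage from (9.15) to (9.44) is a one-line substitution, which we spell out as a sketch for the reader who wants to see where the cross term comes from; it uses only (9.35) and (9.36).

```latex
\begin{align*}
v = t + r_*(r) \;&\Longrightarrow\; dt = dv - \frac{dr}{f(r)};\\
-f\,dt^2 + \frac{dr^2}{f}
  &= -f\,dv^2 + 2\,dv\,dr - \frac{dr^2}{f} + \frac{dr^2}{f}
   = -f\,dv^2 + 2\,dv\,dr .
\end{align*}
```

The singular factor $f(r)^{-1}$ cancels, which is why (9.44) extends across $r = 2m$. Replacing $v = t + r_*$ by $u = t - r_*$ flips the sign of the cross term and gives (9.45).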
The reason we say that Schwarzschild space-time contains a black hole is the following.⁴⁴²


Theorem 9.1 The (future) event horizon $H_E^+ = \{(v,r,\theta,\varphi) \mid r = r_S = 2m\}$ in $M_S$ is:
1. A smooth null hypersurface (cf. §4.6), ruled by lightlike pregeodesics (cf. Proposition 6.9);
2. Diffeomorphic to $\mathbb{R} \times S^2$;
3. A one-way membrane, in that fd causal curves cannot cross $H_E^+$ from $r < r_S$ to $r > r_S$.


The last claim is predicated on a time orientation $T$ on $M_S$. For $r > 2m$ this is naturally given by $T = \partial_t$, which is timelike for $r > 2m$, but (as another remarkable feature of Schwarzschild space-time) this vector becomes lightlike at $r = r_S$ and spacelike for $0 < r < r_S$. This is heuristically clear from (9.15), but since these coordinates break down at $r = r_S$ it is more precise to use (9.44), noting that $\partial_t = \partial_v$. In the absence of a geometrically natural fd timelike vector field defined throughout $M_S$, we therefore define time orientation via a lightlike field, namely

$$L = -\frac{\partial}{\partial r}, \tag{9.47}$$

442 In chapter 10 we will see that the three parts of Theorem 9.1 state general properties of (abstractly defined) event horizons: see Corollary 10.17, Proposition 10.29, and eq. (10.79), defining the event horizon, respectively.

defined in the ingoing Eddington-Finkelstein coordinates $(v,r,\theta,\varphi)$. This is crucial, since although the coordinate $r$ is the same as in the original $(t,r,\theta,\varphi)$ coordinates, the vector field $\partial_r$ is different.⁴⁴³ As in (5.80), we then define the cone of future directed (fd) timelike vectors by

$$\{X_x \in T_xM_S \mid g_x(X_x,X_x) < 0,\ g_x(L_x,X_x) < 0\}. \tag{9.48}$$


A timelike vector $X_x \in T_xM_S$ is future directed (fd) iff $g_x(L_x,X_x) < 0$. Moving back to the original coordinates $(t,r,\theta,\varphi)$ it is easy to check that (9.47) - (9.48) make $\partial_t$ timelike and fd for $r > 2m$, whereas they make $-\partial_r$ in the original coordinates (which is spacelike for $r > 2m$ and lightlike for $r = 2m$) timelike and fd for $r < 2m$. The disadvantage in using a lightlike field like $L$ to define time-orientation is that one cannot define a general lightlike vector $N$ to be fd iff $g(N,L) < 0$ or even $\leq 0$, since the former ($< 0$) fails for $N = L$ whereas the latter ($\leq 0$) would make $N = -L$ fd. Hence this criterion is restricted to lightlike vectors that are not proportional to $L$.

Proof. Claim 2 follows from the coordinate definition of $H_E^+$, which also gives smoothness. The
normal $N$ of a hypersurface defined as a level set $f = c$ is given by $N = (df)^\sharp$, which for the level set
$r = r_S$ gives $N = (dr)^\sharp = \partial_v + f(r)\partial_r$. Hence $g(N,N) = -f(r) + 2f(r) = f(r)$, which vanishes at $r = 2m$.
Thus the normal $N$ of $H_E^+$ is a lightlike vector on $H_E^+$ and this by definition makes $H_E^+$ a null hypersurface.
Also, since $r$ is constant on $H_E^+$, the induced metric $\tilde{g}$ on $H_E^+$ is (9.44) at fixed $r = r_S$, i.e.
$\tilde{g} = r_S^2\, d\Omega$. This metric is degenerate, which again makes $H_E^+$ null, cf. §4.6.
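As a cross-check of the computation of the normal, here is a minimal numerical sketch (not part of the original argument). It assumes that the $(v,r)$ block of (9.44) is $-f(r)dv^2 + 2\,dv\,dr$ with $f(r) = 1 - 2m/r$; all function names are hypothetical.

```python
def f(r, m):
    """Schwarzschild function f(r) = 1 - 2m/r (assumed form)."""
    return 1.0 - 2.0 * m / r


def ef_metric(r, m):
    """(v, r) block of the ingoing Eddington-Finkelstein metric, assumed as in (9.44)."""
    return [[-f(r, m), 1.0], [1.0, 0.0]]


def inverse_2x2(g):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = g
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def normal_to_r_level_set(r, m):
    """Components (N^v, N^r) of N = (dr)^sharp, i.e. N^a = g^{ab} (dr)_b."""
    g_inv = inverse_2x2(ef_metric(r, m))
    dr = [0.0, 1.0]  # components of the one-form dr in the (v, r) basis
    return [sum(g_inv[a][b] * dr[b] for b in range(2)) for a in range(2)]


def norm_squared(vec, r, m):
    """g(vec, vec) for a vector with (v, r) components only."""
    g = ef_metric(r, m)
    return sum(g[a][b] * vec[a] * vec[b] for a in range(2) for b in range(2))
```

For instance, `norm_squared(normal_to_r_level_set(r, m), r, m)` should agree with `f(r, m)` up to rounding, so that it is positive outside, zero on, and negative inside $r = 2m$.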
To prove claim 3 we adopt the notation of §4.6, relabeling $N$ as $L$, and have

$$L = 2\left(\partial_v + f(r)\partial_r\right); \qquad \underline{L} = -\frac{\partial}{\partial r}, \tag{9.49}$$

of which $L$ is lightlike only on $H_E^+$ (where $L = 2\partial_v$), whereas $\underline{L}$ is lightlike everywhere. Note the
correct normalization (6.58), which, given that (9.47) defines time orientation, implies that $L$ is
fd whenever it is causal, which is the case for $0 < r \leq r_S$. Now consider a general curve

$$c(\lambda) = (v(\lambda), r(\lambda), \theta(\lambda), \varphi(\lambda)). \tag{9.50}$$

Using (9.44), the conditions that the curve $c(\cdot)$ be timelike and future directed are, respectively,

$$g(\dot{c},\dot{c}) < 0 \;\Leftrightarrow\; 2\dot{v}\dot{r} - f(r)\dot{v}^2 + r^2(\dot{\theta}^2 + \sin^2\theta\,\dot{\varphi}^2) < 0; \tag{9.51}$$

$$g(\underline{L},\dot{c}) < 0 \;\Leftrightarrow\; \dot{v} > 0. \tag{9.52}$$

On $H_E^+$ we have $f(r) = 0$, which enforces $\dot{r} < 0$. This is an open condition, which by continuity
also holds near $H_E^+$. Hence fd timelike curves must decrease $r$ if they get near $H_E^+$, which means
that they must either stay within the horizon ($r < r_S$) or cross it from $r > r_S$.
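The step from (9.51) - (9.52) to $\dot{r} < 0$ on the horizon may be spelled out as follows. This is only a sketch, assuming the cross term in (9.51) is $2\dot{v}\dot{r}$ as dictated by (9.44), and using nothing beyond $f(r_S) = 0$ and the non-negativity of the angular term:

```latex
\begin{align*}
f(r_S) = 0:\quad & 2\dot{v}\dot{r} + r_S^2\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\varphi}^2\right) < 0 \\
\Rightarrow\quad & 2\dot{v}\dot{r} < -r_S^2\left(\dot{\theta}^2 + \sin^2\theta\,\dot{\varphi}^2\right) \leq 0 \\
\Rightarrow\quad & \dot{r} < 0, \quad \text{since } \dot{v} > 0 \text{ by (9.52)}.
\end{align*}
```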

For general causal curves: (i) eq. (9.52) should be supplemented with the additional possibility
$\dot{c} = p\underline{L}$ for some $p > 0$, which clearly has $\dot{r} = -p < 0$; (ii) one allows zero on the right of (9.51).
On the horizon, the only new case this leaves (i.e. for which $\dot{r} \not< 0$) are the so-called rest photons
that have $r = r_S$ and $(\theta,\varphi)$ constant, and whose lightlike geodesics solve $\nabla_L L = 0$ on $H_E^+$ (these
lightlike geodesics in fact rule the null hypersurface $H_E^+$). Their urge to move outward with the
speed of light is exactly compensated for by the central gravitational pull, so that they are at rest
at some point on $S^2$. Their existence does not affect the claim of the theorem.


The proof gives us more: since $f(r) < 0$ inside $H_E^+$ we must have $\dot{r} < 0$ anywhere inside $H_E^+$
and hence any fd timelike curve within $H_E^+$ hits the singularity (but the rest photons do not!).⁴⁴⁴


443 One may also do this in Minkowski space-time, where in $(v,r,\theta,\varphi)$ coordinates ($v = t + r$), the vector field
$\underline{L} = -\partial_r$ is also lightlike and fd; to see this, just note that $\underline{L} = \partial_t - \partial_r$ in the original coordinates $(t,r,\theta,\varphi)$.

444 It would be a mistake to think that photons can somehow travel around the two-sphere $r = r_S$: as soon as $\dot{\theta}$
and/or $\dot{\varphi}$ are nonzero whilst $\dot{r} = 0$, recalling that $f(r_S) = 0$ the right-hand side of (9.51) can obviously not be zero.
9.4 The Kruskal extension of Schwarzschild space-time


The metrics (9.44) - (9.45), both defined on $M_S$, describe two different space-times, $(M_S, g_\pm)$,
containing a black hole ($g_+$) and a white hole ($g_-$) respectively. The latter is a time-reversed
version of the former, as follows from the fact that $(u,r,\theta,\varphi) \mapsto (v = -u,r,\theta,\varphi)$ is an isometry
from $(M_S,g_+)$ to $(M_S,g_-)$. Note that the Schwarzschild metric is static for $r > 2m$, whereas
both (9.44) and (9.45), valid for $0 < r < \infty$, are merely stationary and hence not time-reversal
invariant. Furthermore, both space-times are extendible, and as such they will be combined into
a single inextendible space-time. To this end, we introduce Kruskal coordinates $(U,V)$ by:⁴⁴⁵


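$$U = -e^{-\kappa u} = -e^{-\kappa(t - r_*)}; \qquad V = e^{\kappa v} = e^{\kappa(t + r_*)}, \tag{9.53}$$

where $r > 2m$, and $\kappa$, the so-called surface gravity at the event horizon,⁴⁴⁶ is defined by

$$\kappa = 1/4m. \tag{9.54}$$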


The pair $(u,v) \in \mathbb{R}^2$ corresponds to $t \in \mathbb{R}$ and $r > 2m$, and hence to $U < 0$ and $V > 0$. This
means that the metric (9.43) in terms of $(u,v)$ applies; in terms of $(U,V)$ this metric turns into

$$g_K = -\frac{32m^3}{r}\,e^{-r/2m}\,dU\,dV + r^2 d\Omega, \tag{9.55}$$

in which $r$, so far subject to $r > 2m$, is regarded as a function of $U$ and $V$ through (9.39), (9.35),
and (9.53). This dependence of $r$ on $(U,V)$ may (relatively) simply be stated as⁴⁴⁷

$$UV = \left(1 - \frac{r}{2m}\right)e^{r/2m}. \tag{9.56}$$


Clearly, the metric (9.55) is well defined for $(U,V) \in \mathbb{R}^2$ as long as $r > 0$. To express this
constraint in terms of $(U,V)$ we extend the transformation (9.53) as follows:

$$U = \pm\left|\frac{r}{2m} - 1\right|^{1/2} e^{\kappa(r - t)}; \qquad V = \pm\left|\frac{r}{2m} - 1\right|^{1/2} e^{\kappa(r + t)}, \tag{9.57}$$

where, using notation in which the first ± refers to $U$ and the second to $V$, the signs are:

                 black hole space-time | white hole space-time
0 < r < 2m               + +           |          − −
r > 2m                   − +           |          + −

Then (9.56) remains valid for $r > 0$, which gives

$$UV < 1. \tag{9.58}$$
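Since (9.56) defines $r$ only implicitly, it may help to see how $r$ can be recovered from $(U,V)$ in practice. The following is a minimal sketch, not taken from the text: it assumes the right-hand side of (9.56) is $F(r) = (1 - r/2m)e^{r/2m}$, which is strictly decreasing on $(0,\infty)$ with $F(0) = 1$, and inverts it by bisection. The function names and the tolerance are hypothetical.

```python
import math


def kruskal_F(r, m):
    """Assumed right-hand side of (9.56): F(r) = (1 - r/2m) exp(r/2m)."""
    return (1.0 - r / (2.0 * m)) * math.exp(r / (2.0 * m))


def radius_from_UV(U, V, m, tol=1e-12):
    """Solve F(r) = U*V for r > 0 by bisection; requires U*V < 1, cf. (9.58)."""
    uv = U * V
    if uv >= 1.0:
        raise ValueError("UV must be < 1 (the locus UV = 1 is the singularity r = 0)")
    lo, hi = 0.0, 2.0 * m
    # F is decreasing, so enlarge the bracket until F(hi) <= uv.
    while kruskal_F(hi, m) > uv:
        hi *= 2.0
    # Invariant: F(lo) > uv >= F(hi).
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if kruskal_F(mid, m) > uv:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

On either axis $U = 0$ or $V = 0$ this returns (numerically) $r = 2m$, in line with the identification of the axes with the horizon.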


445 It is worth asking how these may be found. Searching for good coordinates near $r = r_S$, we approximate $f(r) = f(r_S) + (r - r_S)f'(r_S) + \cdots = 2\kappa(r - r_S) + \cdots$, since $f(r_S) = 0$. Furthermore, near $r = r_S$ we approximate (9.38)
by just keeping the logarithm, which gives $r - r_S \approx e^{2\kappa r_*}/2\kappa = e^{\kappa(v-u)}/2\kappa$. Combining these approximations gives
$f(r) \approx \exp(\kappa(v-u))$, which suggests (9.53). Indeed, in terms of $(U,V)$ the metric (9.43) may be approximated by
$g \approx -\kappa^{-2}dU\,dV + \cdots$, which is regular near $r = r_S$. And this was the whole point of the transformation!

446 The true significance of the surface gravity will emerge in §10.8.

447 Following Sbierski (2018a), define $F : (0,\infty) \to (-\infty,1)$ by $F(r) = \left(1 - \frac{r}{2m}\right)e^{r/2m}$, i.e. the right-hand side of
(9.56). This is a homeomorphism with inverse $F^{-1}$, so that $r = F^{-1}(UV)$, as long as (9.58) holds.
Kruskal diagram for the extended Schwarzschild space-time, which consists of all $(U,V) \in \mathbb{R}^2$ subject
to $UV < 1$, where each point $(U,V)$ is actually a two-sphere. The two wiggled lines correspond to
$r = 0 \Leftrightarrow UV = 1$ and do not belong to the space-time; the same is true for the regions above and below
these lines. The axes $U = 0$ or $V = 0$ both correspond to $r = 2m$. Region I is $U < 0$ and $V > 0$, etc. The
Schwarzschild black hole space-time corresponds to $U \in \mathbb{R}$ and $V > 0$, and hence consists of regions I and II. The
white hole space-time is $U \in \mathbb{R}$ and $V < 0$, and hence consists of regions III and IV. The $U$-$V$ axes are
rotated by 45° since radial lightlike geodesics (i.e. at constant $\theta$ and $\varphi$) correspond to straight lines $U$ =
constant and $V$ = constant, as follows from (9.55). The blue lightlike geodesic is (9.69), the green one is
(9.70), the red one is (9.71), and the orange one is (9.72); all are future directed. All of these geodesics
can clearly be extended: the blue one in the backward direction (so that it enters region IV), the green one
in the forward direction (entering II), etc. This shows that the black hole space-time is extendible, as is
the white hole one. But the total (Kruskal) space-time, containing both, is inextendible.


Kruskal space-time $(M_K, g_K)$ is given by the following space, endowed with the metric (9.55):⁴⁴⁸

$$M_K = \{(U,V) \in \mathbb{R}^2 \mid UV < 1\} \times S^2. \tag{9.59}$$

Schwarzschild space-time $(M_S, g_+)$ is isometrically embedded in $(M_K, g_K)$ as

$$M_+ = \{(U,V) \in \mathbb{R}^2 \mid U \in \mathbb{R},\ V > 0,\ UV < 1\} \times S^2, \tag{9.60}$$


i.e. regions I and II plus the line (U = 0,V > 0). For region I this follows because (9.53) 
transforms the metric (9.43) into (9.55), and we know that (9.43) is in turn equivalent to the 
Schwarzschild metric (9.44), restricted to r > 2m. For region II the metric (9.44), restricted to 
0 < r < 2m, is the unique analytic continuation of the same metric for r > 2m. This is also true 
for (9.55), and hence (9.55) must be equivalent to (9.44) in region II, too.⁴⁴⁹


448 Historically, structures similar or equivalent to Kruskal coordinates and Kruskal space-time were also found by
Synge, Fronsdal, and Szekeres; see Misner, Thorne, & Wheeler (1973), Box 31.1, for more information.

449 Since regions I and III are both isometric to $\mathbb{R}^2 \times S^2$ with metric (9.43), and hence to the original $r > 2m$
Schwarzschild space-time, one might equally well realize the full space-time $(M_S, g_+)$ as the union of II and III.
Similarly, the white hole space-time $(M_S, g_-)$ is isometrically embedded in $(M_K, g_K)$ as

$$M_- = \{(U,V) \in \mathbb{R}^2 \mid U \in \mathbb{R},\ V < 0,\ UV < 1\} \times S^2, \tag{9.61}$$

and hence corresponds to regions III plus IV.⁴⁵⁰ Let us draw the balance between the first two.
Kruskal space-time $(M_K, g_K)$:


1. is static; 
2. has a good timelike vector field defining its time orientation; 
3. is globally hyperbolic; 


4. is inextendible. 


For the first point,⁴⁵¹ a simple computation shows that time translations $t \mapsto t + c$ in the original
coordinates are transformed into

$$U \mapsto e^{-\kappa c}U; \qquad V \mapsto e^{\kappa c}V, \tag{9.62}$$


which are evidently also isometries of the metric (9.55) that preserve the condition (9.56). If t is 
the original time coordinate, the corresponding Killing vector field takes the simple form 


$$\partial_t = \kappa(V\partial_V - U\partial_U), \tag{9.63}$$

as follows from (9.34) - (9.35) and (9.53). If we now agree that in region I the vector field $\partial_t$,
which is timelike there, is future directed, then it follows from (9.55) that in region I, where
$U < 0$, $V > 0$, $r > 2m$, both $\partial_U$ and $\partial_V$ are fd lightlike vector fields. Thus

$$T = \partial_U + \partial_V \tag{9.64}$$


is a globally defined fd timelike vector field that may be used to define time orientation, and 
which in regions I and II is compatible with the time orientation already defined by (9.47). With 
this time orientation, the Kruskal diagram displays what Theorem 9.1 proved, namely that the 
surface r = 2m is an event horizon of the black hole (i.e. I + II plus their r = 2m border). The 
event horizon at r = 2m of the white hole (9.45), i.e. III + IV plus border, plays the opposite role: 
no fd causal curve can move from III to IV, whereas many can cross from r < 2m to r > 2m. This 
follows from a similar analysis as in the proof of Theorem 9.1, with $N$ now given by $N = +\partial_r$.
For I + IV (plus border), r = 2m is a one-way membrane permitting travel from IV to I but not 
vice versa, making I + IV a white hole. Similarly, II + III is another black hole. 

The radial lightlike geodesics (9.22) confirm this. If we choose the (affine) parametrization
such that they are all future directed,⁴⁵² we have the following four inequivalent possibilities:

$$t(s) = s + 2m\ln s + C_1; \quad r(s) = s + 2m; \quad s \in (0,\infty),\ t \in (-\infty,\infty),\ r \in (2m,\infty); \tag{9.65}$$
$$t(s) = s - 2m\ln(-s) + C_2; \quad r(s) = -s + 2m; \quad s \in (-\infty,0),\ t \in (-\infty,\infty),\ r \in (2m,\infty); \tag{9.66}$$
$$t(s) = -s + 2m\ln s + C_3; \quad r(s) = -s + 2m; \quad s \in (0,2m),\ t \in (-\infty,c_3),\ r \in (0,2m); \tag{9.67}$$
$$t(s) = s - 2m\ln s + C_4; \quad r(s) = -s + 2m; \quad s \in (0,2m),\ t \in (c_4,\infty),\ r \in (0,2m). \tag{9.68}$$


450 It is also isometric to regions I plus IV; given (9.44) - (9.45) this would actually be the most natural identification,
except that Kruskal space-time is meant to be the disjoint union of a black hole and a white hole space-time.

451 One might think that staticity can be made explicit in Kruskal-Szekeres coordinates $t = \tfrac{1}{2}(V + U)$ and
$x = \tfrac{1}{2}(V - U)$, where $(t,x) \in \mathbb{R}^2$ are constrained by $t^2 - x^2 < 1$. In terms of these, the Kruskal diagram has the usual
$x$ and $t$ axes, and the metric is given by $g_K = \frac{32m^3}{r}e^{-r/2m}(-dt^2 + dx^2) + r^2 d\Omega$. But since $r$ is implicitly defined by
$t^2 - x^2 = (1 - r/2m)\exp(r/2m)$, cf. (9.56), this form of the metric is not manifestly $t$-independent either.

452 This can be confirmed from (9.55), (9.64), and (9.69) - (9.72). For the last two, note that $1 - (s/2m) > 0$.
Here $c_3 := -z + C_3$ and $c_4 := z + C_4$ with $z := 2m(1 - \ln(2m))$. In terms of $(U,V)$, this reads

$$U(s) = -C_1'; \quad V(s) = C_1''\,s\,e^{s/2m}; \quad s \in (0,\infty); \quad \text{(outgoing)} \tag{9.69}$$
$$U(s) = C_2'\,s\,e^{-s/2m}; \quad V(s) = C_2''; \quad s \in (-\infty,0); \quad \text{(ingoing)} \tag{9.70}$$
$$U(s) = C_3'; \quad V(s) = C_3''\,s\,e^{-s/2m}; \quad s \in (0,2m); \quad \text{(outgoing)} \tag{9.71}$$
$$U(s) = C_4'\,s\,e^{-s/2m}; \quad V(s) = C_4''; \quad s \in (0,2m), \quad \text{(ingoing)} \tag{9.72}$$

where all $C_i'$ and $C_i''$ are positive constants (trivially computable in terms of the $C_i$).
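The passage from the $(t,r)$ form of the geodesics to their $(U,V)$ form can be checked numerically. The sketch below does this for the outgoing geodesic (9.65) only, under stated assumptions that are not spelled out in this section: the tortoise coordinate is taken to be $r_* = r + 2m\ln(r/2m - 1)$, and $U = -e^{-\kappa(t - r_*)}$, $V = e^{\kappa(t + r_*)}$ with $\kappa = 1/4m$. Function names are hypothetical.

```python
import math


def tortoise(r, m):
    """Assumed tortoise coordinate for r > 2m: r_* = r + 2m ln(r/2m - 1)."""
    return r + 2.0 * m * math.log(r / (2.0 * m) - 1.0)


def kruskal_UV(t, r, m):
    """Kruskal coordinates in region I (r > 2m), assumed as in (9.53)."""
    kappa = 1.0 / (4.0 * m)
    r_star = tortoise(r, m)
    U = -math.exp(-kappa * (t - r_star))
    V = math.exp(kappa * (t + r_star))
    return U, V


def outgoing_geodesic(s, m, C1=0.0):
    """Outgoing radial lightlike geodesic (9.65), valid for s > 0; returns (t, r)."""
    return s + 2.0 * m * math.log(s) + C1, s + 2.0 * m
```

Along this curve $U$ should stay constant and negative, while $V/(s\,e^{s/2m})$ should stay constant and positive, which is the content of (9.69).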

For the third point, the x-axis in the Kruskal diagram above is a spacelike Cauchy surface, 
and the inextendibility of Kruskal space-time follows from Proposition 6.2, eq. (9.18), and a 
study of all geodesics in the Kruskal metric (which is not attempted here),⁴⁵³ showing that
all incomplete causal geodesics end up in the singularity at r = 0. Finally, the initial data 
problem whose MGHD is (isometric to) Kruskal space-time is asymptotically flat, albeit with two 
separate asymptotically flat regions of which one seems unrealistic. Hence (Mx, gx) has good 
mathematical properties, but it seems not to correspond to any (known) part of our universe. In 
agreement with this, arguments given below suggest that Kruskal space-time cannot be the end 
result of an astrophysical collapse process (whereas, as we shall see, Schwarzschild can). 

In contrast, Schwarzschild space-time, realized as either $(M_S, g_+)$ or, isometrically, $(M_+, g_K)$:


1. is merely stationary (and not static); 

2. lacks a geometrically defined timelike vector field; 
3. is globally hyperbolic; 

4. is extendible. 


Schwarzschild space-time came about as an extension of the static solution (9.15), defined for
$r > 2m$, to all values $0 < r < \infty$, but this extension is no longer static because of the off-diagonal
terms in the metric (9.44). However, its maximal extension, i.e. Kruskal space-time, is once
again static: adding a white hole to a black hole restores symmetry under time reversal.⁴⁵⁴ As
to the second point, in compensation $(M_S, g_+)$ does have a natural lightlike vector field, viz.
(9.47); see the proof of Theorem 9.1. For the third, Schwarzschild is globally hyperbolic, but
any underlying Cauchy surface $\Sigma$ would have to extend into both regions I and II in the Kruskal
diagram drawn above (it cannot be restricted to region I since e.g. the red lightlike geodesic
just described and drawn would not cross it). In that case $\Sigma$ would still carry complete initial
data; that is, the Riemannian three-manifold $(\Sigma, \tilde{g})$ is geodesically complete. Region I is also
globally hyperbolic by itself, with for example the positive $x$-axis as a Cauchy surface $\Sigma_I$. But
here the initial data are incomplete because many geodesics end at $r = 2m$, and the resulting
space-time is once again extendible. In this case, the $r = 2m$ hypersurface $H_E^+$ acts also as a
future Cauchy horizon $H_C^+ = \partial D^+(\Sigma_I) \setminus \Sigma_I$ for $\Sigma_I$, seen as a wannabe Cauchy surface for the
extension $(M_S, g_+)$, cf. (5.182) and (10.86), which then coincides with the future event horizon.


453 See for example O’Neill (1983), chapter 13 or Plebański & Krasiński (2006), chapter 14. The crucial result is
Proposition 13.36 in O’Neill (1983), which states that an inextendible timelike geodesic $\gamma : I \to M_K$ in $(M_K, g_K)$ is
incomplete iff $r(\gamma(s)) \to 0$ as the affine parameter $s$ approaches a finite endpoint of $I$, with Corollary 13.37 to the
effect that Kruskal space-time is (causally) incomplete and inextendible.

454 Recall that a metric is static iff it is stationary for a hypersurface-orthogonal Killing vector field, which is the
case iff it is stationary and time-reversal invariant (in the flow parameter of the said Killing field), see §8.4.
This reflects the extendibility of Schwarzschild space-time: for example, the radial lightlike
geodesic (9.69) can be extended to negative values of $s$ and then describes a photon moving from
IV to I in finite affine parameter “time”.⁴⁵⁵ Similarly, the radial lightlike geodesic (9.71) can be
extended to $s < 0$ and then describes a photon moving from III to II (since $U > 0$ stays constant while $V$ changes sign).


For those who are familiar with this technique (or jump to §10.2),⁴⁵⁶ we now give the Penrose
diagrams of both Kruskal and Schwarzschild space-time, the former in Penrose’s own hand:


One of the first Penrose diagrams: ‘The Kruskal picture with conformal infinity represented.’⁴⁵⁷




Penrose diagram for Schwarzschild space-time $(M_+, g_K) = (M_S, g_+)$. The green line represents the event
horizon $H_E^+$ at $r = 2m$. The blue line represents a Cauchy surface. The red line marks the end of the
diagram; it does not (even) lie in the conformal completion $(\hat{M}_+, \hat{g}_K)$.


455 The reason we will not notice this even if white holes exist is that according to the description (9.65), it would
require extending our time $t$ beyond minus infinity, i.e. the “beginning of time”, to see it.

456 For the moment, just note that (i) through a conformal transformation, infinity has been brought forward so as
to become a boundary at some finite distance; (ii) the causal structure is the same as in Minkowski space-time.

457 Taken from Penrose (1968), p. 208, Fig. 37. See the Introduction for comments on his style. See also §10.3.
On the other hand, we shall see at the end of this section that Schwarzschild space-time can
result from a realistic collapse process, and although all known black holes in the universe seem 
to be rotating and described by the Kerr metric (with poorly known angular velocities, that is), 
the Schwarzschild metric may be sufficiently close to these to call it physically realistic. 

In conclusion, Kruskal space-time has good mathematical features, but is physically awkward, 
whereas Schwarzschild space-time has exactly the opposite features. Perhaps it is a mistake to 
regard the latter as the “hydrogen atom of GR”, as many textbooks suggest. 


At the origin $U = V = 0$ of the Kruskal diagram (where $r = 2m$ and $t$ is undefined) the event
horizon of the black hole coincides with the one of the white hole. This point (which is really
a two-sphere whose abstract structure is that of a bifurcation surface, see §10.8) is called an
Einstein-Rosen bridge, which later came to be seen as a special case of a wormhole. This bridge
connects the two exterior regions I and III, but one cannot cross it since this would require spacelike (i.e.
superluminal) travel; even any (fd) timelike or lightlike deviation from it would cause the traveler
to fall into the black hole singularity. Nonetheless, one can study its geometry at some fixed
value of $t$ (i.e. as part of a slice of constant $U/V$), which turns out to be quite interesting. We
restrict ourselves to the original description of the bridge by Einstein & Rosen (1935) themselves,
since apart from some use in science fiction the idea seems to be of historical value only.


In terms of the coordinate

$$u = \sqrt{r - 2m}, \tag{9.73}$$

the $r > 2m$ part of the Schwarzschild metric is

$$g = -\frac{u^2}{u^2 + 2m}\,dt^2 + 4(u^2 + 2m)\,du^2 + (u^2 + 2m)^2 d\Omega. \tag{9.74}$$

Although $u > 0$ initially, this makes sense for any $u \in \mathbb{R}$ and as such the solution describes the
two exterior regions (I and III) in the Kruskal diagram. The area of any two-sphere at fixed $u$ is

$$A(u) = 4\pi(2m + u^2)^2. \tag{9.75}$$


This function obviously takes a minimum at $u = 0$, i.e., $r = 2m$, and increases for larger $|u|$. At
fixed $\theta$, where the spheres are circles, one may then draw the bridge as a two-sided trumpet.
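The shape of the bridge is governed entirely by the area function (9.75). As a small illustration (a sketch with a hypothetical function name, assuming $A(u) = 4\pi(2m + u^2)^2$), one may tabulate it on both sides of the throat:

```python
import math


def bridge_area(u, m):
    """Area of the two-sphere at Einstein-Rosen coordinate u, assumed as in (9.75)."""
    return 4.0 * math.pi * (2.0 * m + u * u) ** 2


def throat_area(m):
    """Minimal area, attained at u = 0 (i.e. r = 2m): 16 pi m^2."""
    return bridge_area(0.0, m)
```

The function is even in $u$, so the two sides of the trumpet are mirror images, and the throat area $16\pi m^2$ is the area of the horizon two-sphere.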


We return to the physical origin of Schwarzschild space-time, in the sense that it may be the
final state of a stellar collapse. To this end, the oldest and simplest generally relativistic model
is due to Oppenheimer and Snyder (1939), whose paper played an important role towards the
acceptance of black holes, at a time when the mathematical possibility was clear but Einstein,
Eddington, and many other opinion leaders believed that they were idealizations and that some
physical mechanism would block their actual occurrence in nature.⁴⁵⁸ This model describes the
collapse of a spherically symmetric permeating dust cloud, whose energy-momentum tensor
within the cloud is given by (7.70), whilst $T_{\mu\nu} = 0$ outside the cloud, which is taken to be a ball


458 Very briefly: light stars retire as white dwarfs, in which nuclear burning has ended and inward gravitational 
pressure is stopped by a degenerate electron gas. In 1931, Chandrasekhar discovered that this only works for masses 
$M < 1.46\,M_\odot$, where $M_\odot$ is the solar mass. Heavier stars collapse into neutron stars (typically after a supernova
explosion), but also these have an upper mass, as first suggested by Oppenheimer & Volkoff (1939); the current
bound is about $2.3\,M_\odot$. Heavier stars have nothing to stop gravitational collapse and unless they get rid of some of
their mass/energy they must collapse into a black hole. See e.g. Misner, Thorne, & Wheeler (1973), Joshi (2007),
Lasky (2010), or Weinberg (2020) for the relevant astrophysics, and the references in footnote 270, as well as
Longair (2006), for details on the history. Our brief mathematical treatment below is based on Alford (2020).
in $\mathbb{R}^3$ with radius $R$. This model has only one free parameter, namely the total mass $m$ (initially
of the collapsing matter, eventually of the black hole). At any point in time $\tau$ one has

$$m = (4\pi/3)R^3\rho, \tag{9.76}$$

where the choice of $R$ ($> 2m$) reflects the choice of the origin $\tau = 0$ of (proper) time. Given this
choice, define $\tau_0 = \tfrac{1}{3}\sqrt{2R^3/m}$, in terms of which Oppenheimer-Snyder space-time is given by

$$M = \mathbb{R}^4 \setminus \{(\tau,r,\theta,\varphi) \mid \tau \geq \tau_0,\ r = 0\}; \tag{9.77}$$

$$g_{OS} = -\left(1 - \frac{2m}{r}\right)d\tau^2 + 2\sqrt{\frac{2m}{r}}\,d\tau\,dr + dr^2 + r^2 d\Omega; \quad (r > r_b(\tau)); \tag{9.78}$$

$$g_{OS} = -\left(1 - \frac{2mr^2}{r_b^3}\right)d\tau^2 + 2\sqrt{\frac{2mr^2}{r_b^3}}\,d\tau\,dr + dr^2 + r^2 d\Omega; \quad (r < r_b(\tau)), \tag{9.79}$$

where, compared to the original coordinates $(t,r,\theta,\varphi)$, we have $\tau = t + g(r)$, where $g(r)$ solves

$$\frac{dg}{dr} = \frac{\sqrt{2m/r}}{1 - 2m/r}. \tag{9.80}$$


Indeed,⁴⁵⁹ under this coordinate transformation (9.78) is equivalent to (9.15). Furthermore, in
(9.79) the time-dependent radius of the star $r_b = r_b(\tau)$ is defined in terms of $R = r_b(\tau = 0)$ by

$$r_b(\tau) = \left(R^{3/2} - \tfrac{3}{2}\sqrt{2m}\,\tau\right)^{2/3}. \tag{9.81}$$
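The claimed equivalence of the exterior metric (9.78) with the Schwarzschild metric can be checked in a few lines. The following is a sketch, assuming that (9.15) has the standard form $-f\,dt^2 + f^{-1}dr^2 + r^2 d\Omega$ with $f = 1 - 2m/r$, and that $g'(r) = \sqrt{2m/r}/f(r)$ as in (9.80); the angular part is unaffected and is omitted:

```latex
\begin{align*}
d\tau &= dt + g'(r)\,dr, \qquad g'(r) = \frac{\sqrt{2m/r}}{f(r)}, \\
-f\,d\tau^2 + 2\sqrt{2m/r}\,d\tau\,dr + dr^2
 &= -f\,dt^2 + \left(2\sqrt{2m/r} - 2f g'\right)dt\,dr
    + \left(1 + 2\sqrt{2m/r}\,g' - f g'^2\right)dr^2 \\
 &= -f\,dt^2 + \left(1 + \frac{2m/r}{f}\right)dr^2
  = -f\,dt^2 + \frac{dr^2}{f}.
\end{align*}
```

The cross term vanishes precisely because of the choice of $g'$, and the last step uses $f + 2m/r = 1$.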


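Hence $r_b(\tau_0) = 0$, which means that, as suggested by (9.77), the collapse ends at $\tau = \tau_0$ and
hence for all $\tau \geq \tau_0$ one has the Schwarzschild solution. Another critical time is $\tau = \tau_1$, at which
$r_b(\tau_1) = 2m$ and hence the star implodes through its Schwarzschild radius. Using reduced radii

$$\tilde{r} = r/2m; \qquad \tilde{r}_b = r_b/2m; \qquad \tilde{R} = R/2m, \tag{9.82}$$

the relevant quantities are given by

$$\tilde{r}_b(\tau) = \left(\tilde{R}^{3/2} - \frac{3\tau}{4m}\right)^{2/3}; \tag{9.83}$$

$$\tau_0 = \tfrac{4}{3}m\,\tilde{R}^{3/2}; \qquad \tau_1 = \tfrac{4}{3}m\left(\tilde{R}^{3/2} - 1\right); \qquad \tilde{\tau}_0 = \tfrac{4}{3}m\left(\tilde{R}^{3/2} - \tfrac{27}{8}\right), \tag{9.84}$$

where $\tilde{\tau}_0$ is the earliest time at which the quantity $\tilde{r}(\tau)$ defined below vanishes. The event
horizon $H_E^+$ can be computed from the fact that, at any fixed angle $(\theta,\varphi)$, the fd “outgoing” (but
bouncing) radial lightlike geodesic that passes through $(r = 2m, \theta, \varphi)$ at $\tau = \tau_1$ is given by

$$\tilde{r}(\tau) = \tilde{r}_b(\tau)\left(3 - 2\sqrt{\tilde{r}_b(\tau)}\right). \tag{9.85}$$

This takes its maximum $\tilde{r} = 1$ at $\tau = \tau_1$ and, constrained by $\tilde{r} \geq 0$, has two zeros at $\tilde{\tau}_0$ and $\tau_0$;
clearly, $\tilde{\tau}_0 < \tau_1 < \tau_0$. Together with (9.77) this gives the location of the event horizon as

$$H_E^+ = \{(\tau,r,\theta,\varphi) \in M \mid (\tilde{\tau}_0 \leq \tau \leq \tau_1,\ \tilde{r} = \tilde{r}(\tau)) \vee (\tau \geq \tau_1,\ r = 2m)\}. \tag{9.86}$$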
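To get a feeling for the three critical times and for the shape of the horizon, one may evaluate the formulae numerically. The sketch below is illustrative only: it assumes the reduced star radius $\tilde{r}_b(\tau) = (\tilde{R}^{3/2} - 3\tau/4m)^{2/3}$, the horizon curve $\tilde{r}(\tau) = \tilde{r}_b(3 - 2\sqrt{\tilde{r}_b})$, and the times obtained from $\tilde{r}_b = 0$, $\tilde{r}_b = 1$ and $\tilde{r}_b = 9/4$ (the latter being the nontrivial zero of the horizon curve). The function names, and the name `tau_h` for the horizon formation time, are hypothetical.

```python
import math


def collapse_times(R, m):
    """Return (tau_h, tau_1, tau_0): horizon formation, crossing of r = 2m, end of collapse."""
    Rt32 = (R / (2.0 * m)) ** 1.5                     # reduced initial radius to the power 3/2
    tau_0 = (4.0 * m / 3.0) * Rt32                    # star radius reaches 0
    tau_1 = (4.0 * m / 3.0) * (Rt32 - 1.0)            # star radius reaches 2m
    tau_h = (4.0 * m / 3.0) * (Rt32 - 27.0 / 8.0)     # horizon curve leaves r = 0
    return tau_h, tau_1, tau_0


def star_radius_reduced(tau, R, m):
    """Reduced star radius r_b / 2m at proper time tau (clamped at 0 after collapse)."""
    x = (R / (2.0 * m)) ** 1.5 - 3.0 * tau / (4.0 * m)
    return max(x, 0.0) ** (2.0 / 3.0)


def horizon_radius_reduced(tau, R, m):
    """Reduced horizon radius r / 2m at time tau, or None before the horizon exists."""
    tau_h, tau_1, _ = collapse_times(R, m)
    if tau < tau_h:
        return None
    if tau >= tau_1:
        return 1.0  # outside the star the horizon sits at r = 2m
    rb = star_radius_reduced(tau, R, m)
    return rb * (3.0 - 2.0 * math.sqrt(rb))
```

For example, with $R = 10m$ the horizon forms at the centre well before the star surface crosses $r = 2m$, and grows monotonically until it reaches $r = 2m$ at $\tau_1$.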


459 The metric is only piecewise smooth but satisfies appropriate junction conditions at $r = r_b(\tau)$.
Indeed, this definition is such that some point $x = (\tau,r,\theta,\varphi)$ lies inside the horizon, in that:

$$\tilde{\tau}_0 \leq \tau \leq \tau_1 \text{ and } \tilde{r} < \tilde{r}(\tau), \quad \text{or} \quad \tau > \tau_1 \text{ and } \tilde{r} < 1 \text{ (i.e. } r < 2m\text{)},$$

iff there are no points

$$y = (\tau',r',\theta',\varphi') \in J^+(x) \tag{9.87}$$

for which $\tau' > \tau_1$ and $r' > 2m$. This, in turn, means that future null infinity $\mathscr{I}^+$ cannot be reached
from $x$ via a lightlike curve (or any causal curve); we will formalize this later.⁴⁶⁰ Specifically,
lightlike geodesics starting anywhere at any time $\tau < \tilde{\tau}_0$ reach infinity, whereas those starting at
some $\tau \geq \tilde{\tau}_0$ must start at some $\tilde{r} > \tilde{r}(\tau)$. The geodesics (9.85) demarcate between these two
classes. The situation is illustrated in the pictures, which say more than the formulae:




Left picture: $\tau$-$r$ diagram of the Oppenheimer-Snyder space-time. The green area is the interior of the
black hole and its boundary. The event horizon $H_E^+$ is initially the blue geodesic (9.85), and from $\tau = \tau_1$
onwards it is the line $r = 2m$. Any inextendible fd lightlike curve starting outside the green area will
eventually reach future null infinity $\mathscr{I}^+$. Any fd causal curve starting within the grey area will stay there
and any such fd causal geodesic necessarily falls into the singularity. Picture drawn by Edith de Jong.

Right picture: Penrose diagram of the Oppenheimer-Snyder space-time, very slightly adapted from
Alford (2020), redrawn by Edith de Jong. The curved line shows the evolution of the radius of the star; the
45° line marked $H_E^+$ is the event horizon. This diagram combines features of the corresponding diagrams
for Minkowski space-time (cf. §10.2) and for Kruskal space-time (given earlier in this section).


Although the romanticism has been taken out of it, one cannot deny the physical and
mathematical improvement over Schwarzschild (or Kruskal) space-time: Oppenheimer-Snyder
space-time is geodesically incomplete only at $r = 0$, where it has the same curvature singularity
as Schwarzschild (the vertical $r = 0$ line belongs to the space-time until $\tau = \tau_0$), and hence it is
inextendible, so there is no need for white holes. Finally, it has a complete, asymptotically flat initial
value problem: any hypersurface $\tau$ = constant at $\tau < \tau_0$ is a spacelike Cauchy surface.


460 See §10.3. The black hole area will formally be defined as $M \setminus J^-(\mathscr{I}^+)$, so that the event horizon is $H_E^+ = \partial(M \setminus J^-(\mathscr{I}^+))$. This also gives the event horizon $H_E^+$ of the Schwarzschild solution, as well as the horizons $H_E^+$ in
the Reissner-Nordström and Kerr solutions to come in the next two sections.
9.5 The Reissner-Nordström solution


Reissner (1916) and Nordström (1918) independently extended the Schwarzschild solution to
the electrovac case where the central body is electrically charged and one continues to assume
spherical symmetry.⁴⁶¹ This requires a nonzero energy-momentum tensor (7.84) in which $F_{\mu\nu}$
comes from the potential ($A_0 = -e/r$, $A_i = 0$). A lengthy calculation gives the metric

$$g_{RN} = -h(r)\,dt^2 + h(r)^{-1}dr^2 + r^2 d\Omega; \tag{9.88}$$

$$h(r) = 1 - \frac{2m}{r} + \frac{e^2}{r^2} = \frac{1}{r^2}(r - r_+)(r - r_-); \qquad r_\pm = m \pm \sqrt{m^2 - e^2}, \tag{9.89}$$

where we assume $m > 0$ and $|e| \leq m$ only in rewriting $h(r)$ as $(r - r_+)(r - r_-)/r^2$. For
$|e| > m > 0$ we have $h(r) > 0$ and the metric (9.88), defined for all $t \in \mathbb{R}$ and $r > 0$, turns out to
be inextendible. Other parameter values require new coordinates near both $r = r_\pm$, see below.


Left: Penrose diagram for the Reissner-Nordström solution with $|e| > m > 0$ or the Schwarzschild solution
with $m < 0$. The analogy with the corresponding diagram for Minkowski space-time (see §10.2) is highly
misleading, since in the solutions just mentioned $r = 0$ is a naked (timelike) singularity, whereas in the
Minkowski case it is a coordinate singularity. The Reissner-Nordström metric has a (naked and timelike)
singularity at $r = 0$ and lacks an event horizon. It does have a Cauchy horizon, drawn in red, for the
wannabe Cauchy surface drawn in blue. See §10.6 for details.


Right: Penrose diagram for the (unextended) Reissner-Nordström solution with $|e| = m > 0$. The
singularity at $r = 0$ is shielded by a future event horizon at $r = m$, drawn in red, which at the same time is
a future Cauchy horizon for the wannabe Cauchy surface drawn in blue, whence we write $H^+ = H_E^+ = H_C^+$.
The singularity is timelike (the $m > 0$ Schwarzschild singularity is spacelike). The two red lines marked
$H^-$ are boundaries, but in the extensions discussed below they will be past event and Cauchy horizons.


461 There is a Birkhoff-style derivation of this metric that only requires spherical symmetry (Hoffmann, 1932ab).
Penrose diagram for the (unextended) Reissner-Nordström solution with $0 < |e| < m$, as redefined in
ingoing Eddington-Finkelstein coordinates $(v,r)$. This time there is both an event horizon $H_E^+$ at $r = r_+$,
viz. the green center-NE line, and a Cauchy horizon $H_C^+$ at $r = r_- < r_+$ for say the wannabe Cauchy
surface in blue, drawn as the red center-NE line. The other green and red lines are event and Cauchy
horizons, respectively, for extensions of the space-time.⁴⁶²


Key intuition about this metric comes from the Penrose diagrams above. Although charged
black holes probably do not exist, pedagogically the Reissner-Nordström solution is a useful
intermediate case between Schwarzschild and Kerr.⁴⁶³ There are three very different regimes,
which however have the same curvature singularity at $r = 0$, which is given by,⁴⁶⁴ cf. (9.18),

$$R_{\rho\sigma\mu\nu}R^{\rho\sigma\mu\nu} = \frac{48}{r^8}\left(m^2r^2 - 2me^2r + \tfrac{7}{6}e^4\right). \tag{9.90}$$


462 See (9.94) below for $(v,r)$. The coordinate transformations leading to the conformal completion implicit in this
Penrose diagram are given in Hawking & Ellis (1973), p. 157, but one may also use Penrose’s formulae (10.72) and
(10.73) for $(U_+, V_+)$ instead of $(U,V)$, as defined in (9.103) below. In any case, the green SE-NW line corresponds
to $v = -\infty$, whereas the green SE-NW line up to $H_C^+$ corresponds to $v = \infty$; at $H_C^+$ the $(v,r)$ coordinates break
down, as explained in the main text. One has a similar Penrose diagram for the outgoing Eddington-Finkelstein
coordinates $(u,r)$, which contains a white hole, see e.g. Poisson (2004), §5.2.3. These can be combined, see below.

463 An exhaustive study of the Reissner-Nordström space-time and its properties may be found in Chandrasekhar
(1983), chapter 5. For briefer treatments see also Graves & Brill (1960), Carter (1973), Simpson & Penrose (1973),
Hawking & Ellis (1973), §5.5, Poisson (2004), §5.2, and Plebański & Krasiński (2006), §5.2.3.

464 See Henry (2000), who even computes the Kretschmann scalar for the Kerr-Newman metric.
• $m < 0$ or $0 < m < |e|$. Then $h(r) > 0$ and the metric (9.88) is non-singular except at $r = 0$.
This case is similar to Schwarzschild with $m < 0$. The metric (9.88) is defined for all
$0 < r < \infty$ and the space-time is inextendible. With $T = \partial_t$ as the obvious time orientation
(which for $|e| > m$ stays timelike for all $r > 0$, as opposed to $m > 0$ Schwarzschild), there
are both future-directed past-incomplete timelike curves emanating from the singularity
and future-directed future-incomplete timelike curves crashing into it, so that the singularity
behaves like a point omitted from some (in fact many) double cone(s) $J(x,y)$. Thus a
space-time with a timelike singularity cannot be globally hyperbolic.⁴⁶⁵


• $0 < m = |e|$, called extremal, see below. We will treat this as a limiting case of:


• $0 < |e| < m$. Though all cases are unphysical, this one is “relatively realistic”.


In the last two cases we have to deal with zeros of $h(r)$, where the metric (9.88) breaks down. As
in the Schwarzschild case with $m > 0$, this is resolved by turning to better coordinates, and indeed,
we proceed in almost the same way. The tortoise coordinate $r_*$ now solves $dr_*/dr = h(r)^{-1}$, so
that, with the simplest integration constant, eq. (9.38) is replaced by

$$r_* = r + \frac{r_+^2}{r_+ - r_-}\ln(r - r_+) - \frac{r_-^2}{r_+ - r_-}\ln(r - r_-); \qquad (0 < |e| < m). \tag{9.91}$$

Up to a constant $2m\ln(2m)$, this reduces to (9.38) if $r_- = 0$; we still have the boundary condition
$\lim_{r \to r_+} r_*(r) = -\infty$. The Schwarzschild surface gravity $\kappa = 1/4m$ is now replaced by

$$\kappa_\pm = \frac{r_+ - r_-}{2r_\pm^2} = \frac{\sqrt{m^2 - e^2}}{r_\pm^2}. \tag{9.92}$$

Thus the metric with $|e| = m > 0$ has zero surface gravity, making it an extremal black hole.
The counterparts of the Schwarzschild(ish) metrics (9.43) and (9.44) - (9.45) are given by

$$g_{RN} = -h(r)\,du\,dv + r^2 d\Omega; \tag{9.93}$$
$$g_{RN} = -h(r)\,dv^2 + 2\,dv\,dr + r^2 d\Omega; \tag{9.94}$$
$$g_{RN} = -h(r)\,du^2 - 2\,du\,dr + r^2 d\Omega, \tag{9.95}$$


where $u$ and $v$ are defined as in (9.34) - (9.35), and as before $r = r(u,v)$ via (9.35) and the
inverse of the counterpart of (9.38). Taking (9.94) to define the metric $g_{RN}$, we may now define
Reissner-Nordström space-time as $(M_{RN}, g_{RN})$ where the manifold $M_{RN}$ is the same as the
Schwarzschild manifold $M_S$ defined in (9.46), and time orientation is given by declaring that the
lightlike vector field (9.47) be future directed, just as in the Schwarzschild case. Under the map

$$(v,r,\theta,\varphi) \mapsto (u = -v,r,\theta,\varphi), \tag{9.96}$$

this “ingoing” space-time is isometric to the “outgoing” one based on the same manifold, but
using the metric (9.95), and $+\partial_r$ for time orientation.
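The horizon radii in (9.89) and the surface gravity in (9.92) lend themselves to a quick numerical sanity check. The sketch below is illustrative only; it assumes $h(r) = 1 - 2m/r + e^2/r^2$, $r_\pm = m \pm \sqrt{m^2 - e^2}$, and, for the outer horizon, $\kappa_+ = (r_+ - r_-)/2r_+^2$. Function names are hypothetical.

```python
import math


def rn_h(r, m, e):
    """Assumed Reissner-Nordstrom function h(r) = 1 - 2m/r + e^2/r^2, cf. (9.89)."""
    return 1.0 - 2.0 * m / r + e * e / (r * r)


def rn_horizons(m, e):
    """Return (r_plus, r_minus) for m > 0 and |e| <= m; raise otherwise (no horizons)."""
    disc = m * m - e * e
    if m <= 0.0 or disc < 0.0:
        raise ValueError("horizons require m > 0 and |e| <= m")
    root = math.sqrt(disc)
    return m + root, m - root


def outer_surface_gravity(m, e):
    """Surface gravity of the outer horizon, assumed as (r_+ - r_-) / (2 r_+^2)."""
    r_plus, r_minus = rn_horizons(m, e)
    return (r_plus - r_minus) / (2.0 * r_plus ** 2)
```

With $e = 0$ this reproduces the Schwarzschild value $1/4m$, and with $|e| = m$ it vanishes, in line with the remark on extremal black holes above.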


465 In Penrose’s (1979) terminology, this makes the singularity timelike. In addition, it is naked in being visible far
away, since it is not covered by an event horizon. In contrast, the Schwarzschild singularity for m > 0 is spacelike
and is covered by an event horizon. A singularity is spacelike/timelike/lightlike iff it has these properties in a Penrose
diagram. In this case, where the singularity is located at r = 0, spacelike also means that for small enough ε > 0 the
hypersurface r = ε is spacelike. This is the case for Schwarzschild with m > 0, since its normal ∂_r is timelike for
0 < r < 2m, whereas this normal is spacelike for all cases of Reissner-Nordström, making the singularity timelike,
see Definition 4.15. These things come to a head in cosmic censorship, see § 10.4.
The main properties of Schwarzschild space-time are arguably that (i) it has a spacelike
curvature singularity at r = 0 (which makes it geodesically incomplete), which (ii) is covered by
an event horizon, as expressed in Theorem 9.1. Reissner-Nordström has this singularity, too, but
if 0 < |e| < m it is covered by two event horizons (and this also turns out to make it timelike):


Theorem 9.2 If 0 < |e| < m, the sets H_± = {(v, r, θ, φ) | r = r_±} ⊂ M_RN (at which h = 0) are
null hypersurfaces diffeomorphic to ℝ × S². Each H_± acts as a one-way membrane towards
smaller values of r. If |e| = m, then H_+ = H_- = H, which has the same properties as each H_±.


Proof. The proof of Theorem 9.1 is easily adjusted, cf. the proof of Theorem 9.3 for details. 


This makes Reissner-Nordström space-time totally different from the Schwarzschild one, even 
in the extremal case with a single event horizon. The key properties for 0 < |e| < m are: 


1. The outer horizon H_+ = H_E^+ is the event horizon, since it is the boundary inside which
future (null) infinity can no longer be reached; it is the analogue of H_E^+ in Theorem 9.1.


2. The inner horizon H_- = H_C^+ is a Cauchy horizon for wannabe Cauchy surfaces (cf. Definition 5.36).
Cauchy surfaces do not exist and (M_RN, g_RN) is not globally hyperbolic.⁴⁶⁶


3. The singularity at r = 0 is timelike and repulsive (except for radial lightlike geodesics).


4. The maximally extended space-time has an infinite number of regions (and singularities).


In the extremal case 0 ≠ |e| = m all this remains true: although there is a single event horizon in
that case, it plays the role of both the event horizon H_E^+ of Schwarzschild space-time and of a
Cauchy horizon (which is absent in Schwarzschild space-time). For |e| > m > 0, finally, only
property 3 remains, but as a relic of no. 2 that case is not globally hyperbolic either.⁴⁶⁷

Except for the first, which is Theorem 9.2, we will not prove these points (which could be 
done by studying all geodesics and causal curves), but just argue for them, and relate them. 

The most intuitive point is 3. In the coordinates (t, r, θ, φ) the vector field R = −∂_r is
spacelike for r > r_+, lightlike at r = r_+, and timelike for r_- < r < r_+. In this region R is future
directed and hence r must decrease, so that r = r_- is reached, and crossed. If 0 < r < r_-, then
R becomes spacelike once again, whereas the Schwarzschild-R only changes its causal nature
once, namely when crossing the single event horizon H_E^+, and so R is timelike as r → 0. This
makes the Schwarzschild singularity spacelike (as the normal vector to the r = ε hypersurface
for small ε > 0 is timelike) and unavoidable (since fd timelike curves must decrease r), whereas
the Reissner-Nordström singularity is timelike, since exactly the opposite causal situation reigns.

This suggests that the Reissner-Nordström singularity at r = 0 can be avoided; what’s more,
a fd timelike geodesic cannot even reach it because it is repelled! We only show this for radial
geodesics γ (which by their very nature should have the best chance of hitting the singularity),
but it is true in general. Taking (θ, φ) constant, we parametrize γ(s) = (v(s), r(s)) affinely such
that g(γ̇, γ̇) = −1, where γ̇ = dγ/ds as usual. In the ingoing (v, r, θ, φ) coordinates this gives


h \dot{v}^2 - 2 \dot{v} \dot{r} = 1.   (9.97)

Furthermore, since ∂_t = ∂_v is a Killing vector, the energy E = −g(γ̇, ∂_t) takes the constant value

E = h \dot{v} - \dot{r}.   (9.98)


466 In §10.7 we will see that Cauchy horizons are always ruled by lightlike geodesics, sharpening Theorem 9.2.
467 As one can infer from its Penrose diagram: no wannabe Cauchy surface, drawn as a more or less horizontal
line, is hit by a future inextendible timelike curve lying above it that hits the singularity.
Combining (9.97) - (9.98) we see that, similarly to (9.24), the motion is controlled by

\dot{r}^2 + h(r) = E^2.   (9.99)


That is, h(r) acts like a potential. Since h(r) ~ e²/r² for r → 0, this gives a strong repulsion. On
the other hand, incoming fd radial lightlike geodesics are simply given by constant (θ, φ) and


v = C_1;   r(s) = -s + C_2,   (9.100)


where C_1 and C_2 > 0 are constants. Since we now have g(γ̇, γ̇) = 0, eq. (9.97) is 0 = 0 whilst
(9.98) is E = 1, from which nothing can be concluded. Yet (9.100) gives r(s) → 0 as s → C_2.
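
To make the repulsion in (9.99) concrete, here is a small Python sketch that locates the radius at which an infalling radial timelike geodesic of energy E turns around, i.e. the smallest r > 0 with h(r) = E². It again assumes h(r) = 1 − 2m/r + e²/r² and 0 < |e| ≤ m; the helper names are hypothetical.

```python
import math


def h(r, m, e):
    """Assumed Reissner-Nordstrom function h(r) = 1 - 2m/r + e^2/r^2."""
    return 1.0 - 2.0 * m / r + (e / r) ** 2


def bounce_radius(m, e, energy):
    """Smallest r > 0 with h(r) = E^2, the inner turning point of (9.99).

    h(r) = E^2 is the quadratic (E^2 - 1) r^2 + 2 m r - e^2 = 0; the root is
    written in a rationalised form that also covers the case E^2 = 1.
    """
    if not (0 < abs(e) <= m):
        raise ValueError("this sketch assumes 0 < |e| <= m")
    s = math.sqrt(m * m + (energy * energy - 1.0) * e * e)
    return e * e / (m + s)
```

For any E ≠ 0 the returned radius is strictly positive and lies below r_-, in line with the claim that the geodesic bounces inside the inner horizon instead of reaching r = 0.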

We return to our fd timelike geodesic observer, who (unlike the incoming radial lightlike
geodesic just discussed), after having crossed the future Cauchy horizon H_C^+, bounces back from
the singularity, increases r, and is even able to cross the next Cauchy horizon H_C^- at r = r_-.
Strangely enough, this makes him outgoing rather than ingoing. Naively, this would lead him
back to the region II where he came from, but according to Theorem 9.2 this is impossible for a fd
timelike curve. Therefore, he has entered a new region, interpreted as the interior of a white hole,
from which he can move on to cross its “anti” event horizon H_E^- and enter a new asymptotically
flat region. The process may then be repeated, which leads to (and in turn is illustrated by)
the Penrose diagram on the next page. Adding the south-east diamonds I and II to the original
north-west diamonds I and II should be familiar from the Kruskal extension of Schwarzschild
space-time, whose extension ends there; recall that regions II are then triangles whose northern
or southern borders represent a singularity. Now, however, our ingoing fd timelike observer can
cross either of the red r = r_- lines into one of the triangular regions III, and move on to the new
region II as described above. To make this space-time geodesically complete also into the past
(except at the singularities), one extends the original space-time analogously, to the “south”.

If we take the blue horizontal bar as a wannabe Cauchy surface Σ in the extended space-time,
the first two red lines above it form its future Cauchy horizon H^+(Σ) whilst those to the south of
the region II below the blue line form its past Cauchy horizon H^-(Σ). Indeed, in the triangular
regions III north of this horizon one may initiate inextendible timelike curves that crash into
the singularity southward and go on indefinitely northbound. Such curves do not cross Σ,
contradicting the definition of a Cauchy surface (and similarly to the past).


A similar tower may be drawn for the extremal case 0 ≠ |e| = m, where compared to the
previous case things are simplified by the coalescence r_+ = r_-, so that there is just one type of
horizon H^± that is simultaneously an event horizon H_E^± and a Cauchy horizon H_C^±. Thus one
simply places the entire diagram shown on top of and below itself in such a way that the red H^±
lines match, and repeats this procedure. A given region III then acts as a black hole towards the
lower region I and as a white hole towards the upper one. The difference with the Schwarzschild
solution comes from the difference between the functions


f(r) = (r − 2m)/r;   h(r) = (r − m)²/r²   (9.101)


in the metric, which means that in the original coordinates (t, r, θ, φ) neither ∂_t nor ∂_r changes
its causal nature when the horizon is crossed. In particular, ∂_r remains spacelike and this makes the
singularity timelike (as it is in the two other regimes of the solution).
Penrose diagram of the maximally extended Reissner-Nordström solution for 0 < |e| < m. Region I
corresponds to r > r_+, region II to r_- < r < r_+, and region III to 0 < r < r_-. The repetition is such that
a green cross with the associated null infinities ℐ^± is added both above and below the red crosses, after
which a red cross and the associated r = 0 singularities are added above and below, etc. Compared to the
earlier diagram, the new regions make the space-time geodesically complete except at the singularities,
and hence it is inextendible. Each green cross is an event horizon (and even a bifurcate Killing horizon,
see §10.8), whereas each red cross is a Cauchy horizon with respect to some generic wannabe Cauchy
surface, like the one drawn in blue.⁴⁶⁸


468 Redrawn from Hawking & Ellis (1973), page 158. The labeling of the regions differs from the Kruskal one.
It is interesting to see some of the differences between Schwarzschild and Reissner-Nordström
from the procedure used to merge the solutions (9.44) and (9.45) into a single (Kruskal) space-time.⁴⁶⁹
For 0 < |e| < m, analogously to footnote 445 we use (9.34) - (9.35) and approximate


r_* ≈ \frac{1}{2κ_+} \ln(±2κ_+(r - r_+)),   (9.102)


where the upper sign applies to r > r_+ and the lower one to r_- < r < r_+. This gives the
approximation h(r) ≈ ±exp(κ_+(v − u)). Defining u and v as in (9.34) - (9.35), as r → r_+, the
metric (9.88) is then regularized by the new coordinates


U_+ := ∓e^{-κ_+ u};   V_+ := e^{κ_+ v},   (9.103)

since g_RN ≈ −κ_+^{-2} dU_+ dV_+ + ···, as before. More precisely, (9.56) is now replaced by

U_+ V_+ = ∓e^{2κ_+ r_*} = -e^{2κ_+ r} (r - r_+)(r - r_-)^{-r_-^2/r_+^2}.   (9.104)

Together with (9.103) and (9.93), this gives the exact metric in (U_+, V_+) coordinates as

ds^2 = -\frac{e^{-2κ_+ r}}{κ_+^2 r^2} (r - r_-)^{1 + r_-^2/r_+^2} dU_+ dV_+ + r^2 dΩ,   (9.105)


where r = r(U_+, V_+) is defined via (9.104), i.e. via (9.103), (9.91), and (9.34) - (9.35), as
usual.⁴⁷⁰ For r → r_- we have r − r_- ~ (U_+V_+)^{−r_+²/r_-²}, so that r = r_- corresponds to U_+V_+ = ∞
and hence is out of the range of the (U_+, V_+) coordinates. To get to r = r_- and a fortiori to
r → 0, we introduce new coordinates (U_-, V_-) by making the replacements


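κ_+ ⇝ κ_- := \frac{1}{2} h'(r_-) = -\frac{\sqrt{m^2 - e^2}}{r_-^2};   (9.106)
U_+ ⇝ U_- := ∓e^{-κ_- u};   V_+ ⇝ V_- := e^{κ_- v};   (9.107)
U_+ V_+ ⇝ U_- V_- = ∓e^{2κ_- r_*} = -e^{2κ_- r} (r - r_-) |r - r_+|^{-r_+^2/r_-^2};   (9.108)
ds^2 = -\frac{e^{-2κ_- r}}{κ_-^2 r^2} (r - r_+) |r - r_+|^{r_+^2/r_-^2} dU_- dV_- + r^2 dΩ,   (9.109)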


where this time the upper sign refers to r_- < r < r_+ and the lower one to 0 < r < r_-. This metric
is singular at r = r_+, so that unlike the Schwarzschild case, but somewhat like de Sitter, there
isn’t a single coordinate system that adequately describes the merger. Referring to the above
Penrose diagram, the interior of the large diamond consisting of the regions I and II from the
original space-time, plus the new regions I and II south-east of those (which totality is similar to
the entire Kruskal space-time) is described by the (U_+, V_+) coordinates, which however break
down near the border lines r = r_- of the large diamond (both north and south). Unlike the
Kruskal case (in which r_- = 0) these can be crossed, but this crossing must be described in the
new coordinates (U_-, V_-), which can be started in regions II and extend to regions III, etc.
However interesting all this may be, similar comments apply as in the Schwarzschild case:
realistic collapse is not expected to lead to solutions (and Penrose diagrams) like this, although
an exact solution showing this seems lacking.⁴⁷¹ In addition, the interior part of the solution
seems unstable; in particular, the Cauchy horizon is believed to turn into a curvature singularity
under small perturbations, including even such small effects as an observer trying to cross it.
This is a major point in favour of Penrose’s (strong) cosmic censorship hypothesis; see § 10.4.


469 We here essentially follow Poisson (2004), §5.2.
470 If e = 0, then (9.105) is not quite (9.55); it would be if the constant −2m ln(2m) in (9.38) were omitted.
471 See e.g. Sanchis-Gual et al. (2016) for some non-rigorous work in this direction.
9.6 The Kerr solution


The last, and physically most relevant, solution to the vacuum Einstein equations we discuss is


g = -dt^2 + \frac{2mr}{ρ^2} (a \sin^2θ \, dφ - dt)^2 + ρ^2 (Δ^{-1} dr^2 + dθ^2) + (r^2 + a^2) \sin^2θ \, dφ^2
  = -\left(1 - \frac{2mr}{ρ^2}\right) dt^2 - \frac{4mar \sin^2θ}{ρ^2} dt \, dφ + \frac{ρ^2}{Δ} dr^2 + ρ^2 dθ^2 + \frac{Σ \sin^2θ}{ρ^2} dφ^2;   (9.110)
Δ := r^2 - 2mr + a^2 = r^2 \left(1 - \frac{2m}{r} + \frac{a^2}{r^2}\right);   (9.111)
ρ^2 := r^2 + a^2 \cos^2θ;   (9.112)
Σ := (a^2 + r^2)^2 - a^2 Δ \sin^2θ.   (9.113)


This is the Kerr metric,⁴⁷² parametrized by m > 0 and a ∈ ℝ∖{0}, expressed in Boyer-Lindquist
coordinates.⁴⁷³ These coordinates (t, r, θ, φ) look the same as those in the Schwarzschild
solution (9.15), but this analogy is partly misleading and only makes sense for r > 2m.⁴⁷⁴ In that
case, (r, θ, φ) are the usual spherical polar coordinates, and t is the usual time coordinate. In
particular, an important difference with both the Schwarzschild and Reissner-Nordström space-times
is that the “radial” coordinate r now takes values in ℝ (as does t). The set
{(t, r, θ, φ) | θ ∈ [0, π], φ ∈ [0, 2π)} is therefore (topologically) a two-sphere at any fixed (t, r),
even if r = 0. The curvature singularity, which in the Schwarzschild metric is located at r = 0 and
hence is a point at any fixed time t, is now located at ρ² = 0. At fixed t this set is (topologically)
a circle (called a ring in this context), and the entire set ρ² = 0, i.e. r = cos θ = 0, is given by


𝒮 := {(t, r = 0, θ = ½π, φ) | t ∈ ℝ, φ ∈ [0, 2π)}.   (9.114)

Indeed, the Kretschmann scalar for (9.110) is given by the following generalization of (9.18):


R^{ρσμν} R_{ρσμν} = \frac{48 m^2}{ρ^{12}} (r^2 - a^2 \cos^2θ)(ρ^4 - 16 a^2 r^2 \cos^2θ),   (9.115)


which blows up on 𝒮. Apart from (9.114) there are no other singularities of (9.110) except
coordinate issues, and so we take the (preliminary) manifold underlying Kerr space-time to be


M = (ℝ² × S²) ∖ 𝒮.   (9.116)


We have not yet found the right coordinates on all of (9.116), since the metric (9.110) looks
singular also outside the ring 𝒮, namely where Δ = 0. This can be overcome in a similar way
as for the Schwarzschild and Reissner-Nordström metrics, namely by passing to Eddington-Finkelstein
coordinates. However, before doing so, we can already look at some interesting
geodesics. The details depend on a case distinction similar to the one for Reissner-Nordström:


472 The metric was discovered by Kerr (1963); see Melia (2009) for the history of this discovery as well as
biographical information about Kerr. An exhaustive mathematical treatment of the Kerr metric is given in the
monographs by Chandrasekhar (1983) and O’Neill (1995), whereas the volume edited by Wiltshire, Visser, & Scott
(2009) is more physics oriented. The introduction to this volume, available in preprint form as Visser (2006), is a
nice first introduction, as is Heinicke & Hehl (2015). Among the general GR textbooks, the one by Plebański &
Krasiński (2006) also gives very detailed coverage. The Les Houches lectures by Carter (1973) remain valuable.

473 These coordinates were introduced in Boyer & Lindquist (1967). During a brief visit to the Center for Relativity
at the University of Texas, Austin, which is also where Kerr discovered his metric, Robert Boyer (1933-1966) was
tragically killed by Charles Whitman in the University Tower shooting massacre on August 1, 1966 (Melia, 2009).

474 This places (t, r, θ, φ) outside the ergosphere and hence a fortiori outside any relevant horizon, see below.
• 0 < |a| < m, called the slowly rotating case (comparable with 0 < |e| < m), which is
astrophysically relevant. Then Δ has two distinct zeros, which, as in (9.89), are given by


r_± = m ± \sqrt{m^2 - a^2}.   (9.117)


It turns out that r = r_+ gives the event horizon (as in the Schwarzschild case a = 0, where
r_+ = 2m), but r_- is a Cauchy horizon (as for the Reissner-Nordström metric).


• 0 < |a| = m, the extremal case (comparable with |e| = m), where r_+ = r_-.
• 0 < m < |a|, the rapidly rotating case (comparable with 0 < m < |e|), where Δ > 0.
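
The case distinction above can be summarised in a few lines of Python. This is only an illustrative sketch: the function name and the returned labels are ours, and the radii are simply the zeros of Δ = r² − 2mr + a² as in (9.111) and (9.117).

```python
import math


def kerr_horizons(m, a):
    """Classify the Kerr parameters and return the zeros of Delta = r^2 - 2mr + a^2."""
    if m <= 0 or a == 0:
        raise ValueError("the text assumes m > 0 and a != 0")
    disc = m * m - a * a
    if disc > 0:
        root = math.sqrt(disc)
        return "slowly rotating", (m - root, m + root)  # (r_minus, r_plus)
    if disc == 0:
        return "extremal", (m,)  # r_plus = r_minus = m
    return "rapidly rotating", ()  # Delta > 0 everywhere, no horizons
```

For instance, m = 1 and a = 0.6 give the slowly rotating case with horizons near r = 0.2 and r = 1.8.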


The interpretation of these cases, suggested by their names, comes from the fact that, due to it
being stationary, axisymmetric, and asymptotically flat, the Kerr solution has well-defined total
mass/energy ℰ and angular momentum 𝒥. These may be defined by the Komar formulae:


ℰ := -\frac{1}{8π} \oint_{S^2} dσ_{μν} ∇^μ T^ν;   𝒥 := \frac{1}{16π} \oint_{S^2} dσ_{μν} ∇^μ A^ν,   (9.118)


where (at least in the asymptotic region) T = ∂_t is the Killing vector field defining stationarity, and
A = ∂_φ is the Killing vector field defining axial symmetry. The surface element is given by


dσ_{μν} = (n_μ N_ν - n_ν N_μ) d^2σ,   (9.119)


where d²σ was defined below (8.103). One takes a spacelike wannabe Cauchy surface Σ ⊂ M
(since Kerr space-time is not globally hyperbolic this is all one can do), with fd timelike normal
N, containing a sphere S² in the asymptotically flat region, with outward normal n relative to the
embedding S² ↪ Σ. It can then be shown that ℰ and 𝒥 are independent of Σ and S², and yield


ℰ = m,   𝒥 = am.   (9.120)


Thus the metric (9.110) describes a space-time rotating with constant angular velocity. It is
stationary but not static: the solution is not invariant under t ↦ −t but under the double inversion


(t, φ) ↦ (−t, −φ).   (9.121)


This is what one would indeed expect of an object rotating with constant angular velocity, where
φ is the angle of rotation, since reversing time also reverses the direction of rotation.

We now turn to geodesic motion, starting with a more abstract perspective on the Schwarzschild
constants of motion E and L, cf. (9.21). Let X be a Killing vector field, so that ℒ_X g = 0, i.e.,


g(∇_Y X, Z) + g(∇_Z X, Y) = 0 for all Y, Z ∈ 𝔛(M).   (9.122)


For an observer with four-velocity u = γ̇ moving along a causal geodesic γ, eq. (9.122) plus
the geodesic equation ∇_u u = 0 make g(u, X) a constant of motion, since taking Y = Z = u in
(9.122) gives 2g_{γ(s)}(∇_u X, u) = 0, and hence, since ∇_u u = 0 because γ is a geodesic,


\frac{d}{ds} g_{γ(s)}(X, u) = ∇_u (g_{γ(s)}(X, u)) = g_{γ(s)}(∇_u X, u) + g_{γ(s)}(X, ∇_u u) = 0.   (9.123)


475 See e.g. Gourgoulhon (2012), §8.6. The computation of 𝒥 was first done by Kerr himself, see Melia (2017),
page 75. The computation of ℰ, which coincides with II in (8.126), is similar to the Schwarzschild case, since one
may neglect the a²/r² term in Δ in (9.111), and many other terms drop out by symmetry.
Hence apart from the constant of (geodesic) motion g(u, u), whose value depends on the choice
of the affine parameter s and may be fixed to −m², where m is the mass of the body moving
on the geodesic, our observer carries at least as many conserved quantities as there are linearly
independent Killing vector fields. For the Kerr metric this gives


E := −g(u, ∂_t);   L := g(u, ∂_φ),   (9.124)

interpreted as its energy and (azimuthal) angular momentum, respectively. If L = 0, then


\frac{dφ(t)}{dt} = -\frac{g_{tφ}}{g_{φφ}} = \frac{2mar}{Σ},   (9.125)


which means that stationary observers rotate with the black hole (inertial frame dragging).
Surprisingly, the Kerr metric leads to a fourth constant of motion along geodesics, which is
not explicable in terms of isometries of the metric (and remains somewhat mysterious). It was
discovered by Carter and may therefore be called C. These four constants of motion turn the
four second-order geodesic equations into a first-order system,⁴⁷⁶ which for m = 0 reads:⁴⁷⁷

Δ ρ^2 \dot{t} = Σ E - 2marL;   (9.126)
ρ^4 \dot{r}^2 = E^2 r^4 + (a^2 E^2 - L^2 - C) r^2 + 2m((L - aE)^2 + C) r - a^2 C;   (9.127)
ρ^4 \dot{θ}^2 = C + \left(a^2 E^2 - \frac{L^2}{\sin^2θ}\right) \cos^2θ;   (9.128)
Δ ρ^2 \dot{φ} = 2maEr + (ρ^2 - 2mr) \frac{L}{\sin^2θ}.   (9.129)
Compare (9.21); one difference with the Schwarzschild case is that closed geodesic orbits are no
longer necessarily planar. However, planar orbits do exist and include the Kerr analogues of the
unstable photon rings at r = 3m in the Schwarzschild metric. These now arise by taking C = 0
and θ = π/2, in which case ρ² = r², and (9.127) can be written in a way similar to (9.24), viz.


\dot{r}^2 + V(r) = E^2;   V(r) := \frac{L^2 - a^2 E^2}{r^2} - \frac{2m(L - aE)^2}{r^3},   (9.130)


cf. (9.26). Photon rings by definition have constant r, and, assuming 0 < |a| < m, solving the
ensuing equations V(r) = E² and V′(r) = 0 gives two unstable orbits with constant radii


r_± = 2m \left(1 + \cos\left(\tfrac{2}{3} \arccos(±|a|/m)\right)\right).   (9.131)


Depending on the value of |a|/m, these fall in the range m < r_- < 3m < r_+ < 4m. For a = 0
the Schwarzschild case r_+ = r_- = 3m is recovered. For a > 0 the smaller orbit is prograde (i.e.
co-rotating with the black hole), whereas the larger one is retrograde (rotating in the opposite
direction). For C > 0 there are other spherical photon orbits off the equatorial plane θ = ½π.
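
The photon ring radii (9.131) are easy to check numerically against the potential in (9.130). The Python sketch below is illustrative only: it assumes a > 0 and 0 < a ≤ m, all names are ours, and the helper `critical_angular_momentum` uses the relation (L/E − a)² = r³/m, which one obtains by combining V(r) = E² with V′(r) = 0 (our own short computation, not a formula quoted from the text).

```python
import math


def photon_ring_radii(m, a):
    """Equatorial photon ring radii (r_minus, r_plus) from (9.131)."""
    x = abs(a) / m
    if not (0 < x <= 1):
        raise ValueError("this sketch assumes 0 < |a| <= m")
    r_minus = 2.0 * m * (1.0 + math.cos(2.0 / 3.0 * math.acos(-x)))
    r_plus = 2.0 * m * (1.0 + math.cos(2.0 / 3.0 * math.acos(x)))
    return r_minus, r_plus


def potential(r, m, a, energy, ang_mom):
    """V(r) from (9.130), valid for C = 0 in the equatorial plane."""
    return ((ang_mom ** 2 - (a * energy) ** 2) / r ** 2
            - 2.0 * m * (ang_mom - a * energy) ** 2 / r ** 3)


def potential_derivative(r, m, a, energy, ang_mom):
    """dV/dr for the potential above."""
    return (-2.0 * (ang_mom ** 2 - (a * energy) ** 2) / r ** 3
            + 6.0 * m * (ang_mom - a * energy) ** 2 / r ** 4)


def critical_angular_momentum(r, m, a, energy, prograde):
    """L with (L/E - a)^2 = r^3/m; assumes a > 0 (prograde means L > 0)."""
    sign = 1.0 if prograde else -1.0
    return energy * (a + sign * math.sqrt(r ** 3 / m))
```

With m = 1 and a = 0.5 the radii come out near 2.35 and 3.53, and for the matching angular momenta both V − E² and V′ vanish there up to rounding.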
At the opposite end, one has the Kerr version of radial lightlike geodesics, which solve

\frac{dt}{dr} = ±\frac{r^2 + a^2}{Δ};   \frac{dθ}{dr} = 0;   \frac{dφ}{dr} = ±\frac{a}{Δ}   (9.132)

at radii r where Δ(r) ≠ 0; on the two horizons, where Δ(r_±) = 0, these orbits are rest photons,
which solve (9.126) - (9.129) with E = L = C = 0. As in the Schwarzschild case, these lightlike
geodesics rule the event and Cauchy horizons in the sense of Corollary 10.17 below.


476 See Plebański & Krasiński (2006), §20.6, §20.7 and O’Neill (1995), chapter 4. We also consulted Teo (2003).
477 Putting L = 0 in (9.129) does not directly reproduce (9.125), since also E has to be eliminated from (9.129).
This constant is a linear combination of ṫ and φ̇, from which ṫ must be eliminated from the condition that L = 0,
where L is also a linear combination of ṫ and φ̇. See e.g. eqs. (20.104) - (20.105) in Plebański & Krasiński (2006).
9.7 Inside the Kerr black hole


We now find coordinates in which the zeros of Δ are overcome, starting with the slowly rotating
case 0 < |a| < m. The starting point is once again to introduce a radial tortoise coordinate r_*(r),
but in addition we need a new azimuthal angle φ_± := φ ± A(r), where r_* and A solve


\frac{dr_*(r)}{dr} = \frac{r^2 + a^2}{Δ};   \frac{dA(r)}{dr} = \frac{a}{Δ},   (9.133)

cf. (9.36). With an appropriate boundary condition these equations are solved by

r_* = r + \frac{m r_+}{\sqrt{m^2 - a^2}} \ln|r - r_+| - \frac{m r_-}{\sqrt{m^2 - a^2}} \ln|r - r_-|;   (9.134)
A = \frac{a}{2\sqrt{m^2 - a^2}} \ln\left|\frac{r - r_+}{r - r_-}\right|.   (9.135)
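
As a sanity check on (9.133) - (9.135), one may differentiate the closed-form expressions numerically. The following Python sketch does this with a central difference; it assumes 0 < |a| < m, and the function names and the step size are our own choices.

```python
import math


def _horizons(m, a):
    """Return (root, r_minus, r_plus) with root = sqrt(m^2 - a^2), assuming 0 < |a| < m."""
    root = math.sqrt(m * m - a * a)
    return root, m - root, m + root


def tortoise(r, m, a):
    """Kerr tortoise coordinate r_*(r) as in (9.134)."""
    root, r_minus, r_plus = _horizons(m, a)
    return (r + m * r_plus / root * math.log(abs(r - r_plus))
            - m * r_minus / root * math.log(abs(r - r_minus)))


def azimuth_shift(r, m, a):
    """The function A(r) as in (9.135)."""
    root, r_minus, r_plus = _horizons(m, a)
    return a / (2.0 * root) * math.log(abs((r - r_plus) / (r - r_minus)))


def central_difference(f, r, step=1e-6):
    """Second-order finite-difference approximation of f'(r)."""
    return (f(r + step) - f(r - step)) / (2.0 * step)
```

Away from the horizons, the numerical derivatives of `tortoise` and `azimuth_shift` should agree with (r² + a²)/Δ and a/Δ, respectively, in each of the three regions r > r_+, r_- < r < r_+ and r < r_-.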


We pass to lightlike coordinates u = v_- = t − r_* and v = v_+ = t + r_*, cf. (9.34) - (9.35). These,
in turn, give ingoing and outgoing coordinates, where we relabel (u, v) = (v_-, v_+), i.e.


(v, r, θ, φ_+) = (v_+, r, θ, φ_+);   (u, r, θ, φ_-) = (v_-, r, θ, φ_-).   (9.136)


Similar to (9.44) - (9.45), the Kerr metric (9.110) then becomes


g_± = -\left(1 - \frac{2mr}{ρ^2}\right) dv_±^2 - \frac{4mar \sin^2θ}{ρ^2} dv_± dφ_± + \frac{Σ \sin^2θ}{ρ^2} dφ_±^2 + ρ^2 dθ^2 ± 2 dv_± dr ∓ 2a \sin^2θ \, dφ_± dr.   (9.137)


This is regular throughout the Kerr space-time (9.116); the coordinate singularities of (9.110)
caused by Δ = 0 have now been removed. Explicit computation of the geodesic equations is
much more work now than in the Schwarzschild case, but the result is essentially the same (with
φ ⇝ φ_±), namely that for constant C_± the following formulae define radial lightlike geodesics:


(v_-(s) = v_-(0), r(s) = s + C_-, θ(s) = θ(0), φ_-(s) = φ_-(0));   (9.138)
(v_+(s) = v_+(0), r(s) = −s + C_+, θ(s) = θ(0), φ_+(s) = φ_+(0)),   (9.139)
which are called outgoing and ingoing, respectively, similar to the blue and the green Schwarzschild
lightlike geodesics drawn in the Kruskal diagram in §9.4. For r > 2m one can also see this
in Boyer-Lindquist coordinates, where a “radial” lightlike geodesic γ(s) = (t(s), r(s), θ(s), φ(s))
still has constant θ, but moving φ. In terms of the constant energy E = −g(γ̇, ∂_t), one finds


\dot{t} = \frac{E(r^2 + a^2)}{Δ};   \dot{r} = ±E;   \dot{θ} = 0;   \dot{φ} = \frac{aE}{Δ},   (9.140)

with the upper sign ṙ = +E for outgoing geodesics and the lower sign ṙ = −E for incoming
ones. Both lightlike geodesics in (9.138) - (9.139) are future directed if we time-orient (M_K, g_K)
as in the Schwarzschild case, cf. (9.47), namely by declaring L = −∂_r in the new coordinates
(v, r, θ, φ_+), which also here is a lightlike vector, to be future directed. For r > 2m this makes ∂_t
in the original coordinates (t, r, θ, φ), which is timelike in that region, also future directed, as it
should. In the same original coordinates the vector −∂_r is fd timelike in the region r_- < r < r_+.
In the region r < r_- things are more involved. Most remarkably, there is a region near the
ring where the vector field ∂_φ is timelike,⁴⁷⁸ so that one has closed timelike loops! Hence Kerr
space-time is acausal, cf. Definition 5.28. Few people are bothered by this, though, since both
physicists and mathematicians trust the Kerr solution only up to r_-, beyond which it is supposed
to be unstable (see §10.5 and the corresponding comments at the end of §9.5).

If we vary the starting point (v(0), r(0), θ(0), φ(0)), the ingoing lightlike geodesics (similarly
the outgoing ones) form a null congruence, cf. §6.3; the tangent vector field is traditionally
called ℓ. In terms of these, the Kerr metric assumes the amazingly simple Kerr-Schild form


g_{μν} = η_{μν} + \frac{2mr}{ρ^2} ℓ_μ ℓ_ν,   (9.141)


where η is the Minkowski metric in whatever coordinates are used. This shows, in particular,
that for m = 0 the Kerr metric is the Minkowski metric, which is not quite obvious from (9.110).
In any case, we may now generalize Theorems 9.1 and 9.2:


Theorem 9.3 Both horizons H_± = {(v, r, θ, φ_+) | r = r_±} (where Δ = 0) are null hypersurfaces,
are homeomorphic to ℝ × S², and are one-way membranes towards smaller values of r.


Proof. The proof of Theorem 9.1 is easily adjusted. First, since r is constant on H_±, the induced
metric g̃ on H_± is simply (9.137) without the two terms containing dr, with determinant


det(g̃) = −ρ² Δ sin²θ.   (9.142)


This vanishes at H_± (defined as the locus where Δ = 0), so that H_± are null hypersurfaces. The
other proof of this fact works as well: the normal L_± to H_± is given by


L_± = 2(∂_v + Ω_± ∂_{φ_+});   Ω_± := \frac{a}{r_±^2 + a^2} = \frac{a}{2mr_±},   (9.143)


which is lightlike on H_± (we omit the general expression for the normal to a hypersurface r = c).
To prove the one-way membrane property, instead of (9.51) - (9.52), we now have


g(ċ, ċ) < 0  ⇔  \dot{r}(\dot{v} - a \sin^2θ \, \dot{φ}_+) + \tfrac{1}{2} A < 0;   (9.144)
g(L, ċ) < 0  ⇔  \dot{v} - a \sin^2θ \, \dot{φ}_+ > 0,   (9.145)


where we have abbreviated a lengthy expression coming from (9.137) by

A := -\left(1 - \frac{2mr}{ρ^2}\right) \dot{v}^2 - \frac{4mar \sin^2θ}{ρ^2} \dot{v} \dot{φ}_+ + \frac{Σ}{ρ^2} \sin^2θ \, \dot{φ}_+^2 + ρ^2 \dot{θ}^2.


At both horizons H_±, this expression A somewhat miraculously takes the manifestly non-negative form


A|_{H_±} = \frac{\sin^2θ}{ρ^2} (a \dot{v} - 2mr_± \dot{φ}_+)^2 + ρ^2 \dot{θ}^2,   (9.146)


which replaces the terms r²(θ̇² + sin²θ φ̇²) in (9.51), at r = 2m. Since A ≥ 0 at H_±, the argument
for the Schwarzschild case still applies and hence for timelike fd curves we must have


\dot{r} < 0,   (9.147)


at, and therefore also near H_±. The final step of the proof also applies here, except that the fd
“rest” photons for the Kerr metric are characterized by r = r_± (and hence ṙ = 0), and θ̇ = 0, but


\dot{φ}_+ = Ω_± \dot{v};   \dot{v} > a \sin^2θ \, \dot{φ}_+.   (9.148)


Hence φ_+ cannot be constant, as is also clear from the fact that, as for Schwarzschild, these
photons solve ∇_{L_±} L_± = 0, with L_± given by (9.143). Thus they do hover around on S².


478 For θ = ½π the prefactor of dφ² in (9.110) equals a² + r² + 2ma²/r, which is negative for small negative r.
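
The “somewhat miraculous” simplification in (9.146) can be verified numerically. The Python sketch below evaluates the general expression A (as read off from the dr-free part of (9.137)) and the horizon form (9.146) at r = r_±; the function names are ours, and 0 < |a| < m is assumed.

```python
import math


def quadratic_form_A(r, theta, m, a, v_dot, theta_dot, phi_dot):
    """General expression A built from the dr-free terms of (9.137)."""
    delta = r * r - 2.0 * m * r + a * a
    rho2 = r * r + (a * math.cos(theta)) ** 2
    s2 = math.sin(theta) ** 2
    sigma = (a * a + r * r) ** 2 - a * a * delta * s2
    return (-(1.0 - 2.0 * m * r / rho2) * v_dot ** 2
            - 4.0 * m * a * r * s2 / rho2 * v_dot * phi_dot
            + sigma * s2 / rho2 * phi_dot ** 2
            + rho2 * theta_dot ** 2)


def quadratic_form_A_on_horizon(r_h, theta, m, a, v_dot, theta_dot, phi_dot):
    """Sum-of-squares form (9.146), meant to be used only at r_h = r_plus or r_minus."""
    rho2 = r_h * r_h + (a * math.cos(theta)) ** 2
    s2 = math.sin(theta) ** 2
    return (s2 / rho2 * (a * v_dot - 2.0 * m * r_h * phi_dot) ** 2
            + rho2 * theta_dot ** 2)
```

Evaluating both functions at r = m ± √(m² − a²) for arbitrary velocities should give the same number up to rounding, which is the content of (9.146).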


The interpretation of the horizons H_± is the same as for Reissner-Nordström: H_+ = H_E^+ is
the event horizon, whereas H_- = H_C^+ is a Cauchy horizon. The vector field ∂_r also behaves
analogously:⁴⁷⁹ it is spacelike for r > r_+ and r < r_- and timelike for r_- < r < r_+. Observers
that cross H_+ and subsequently H_- can therefore avoid the singularity (although they cannot
return). The singularity is timelike = locally naked, cf. §10.4, but is covered by an event horizon.


We return to these horizons in §10.8; the main point will be that the Killing vector field


X = ∂_t + Ω_+ ∂_φ;   Ω_+ := Ω(r_+) = \frac{2mar_+}{(r_+^2 + a^2)^2} = \frac{a}{r_+^2 + a^2},   (9.149)


which is timelike outside H_+, becomes lightlike at H_+, which thence is called a Killing horizon.
A closely related property of a Kerr black hole is its ergosphere.⁴⁸⁰ We first define the outer
ergosurface ℰ^+ (also called the stationary limit surface) and inner ergosurface ℰ^- by


ℰ^± := {(t, r, θ, φ) | r = r_E^±(θ)};   r_E^±(θ) := m ± \sqrt{m^2 - a^2 \cos^2θ}.   (9.150)

Writing g_tt as −(r − r_E^+)(r − r_E^-)/ρ², we see that ℰ^± is where ∂_t changes its causal nature:

• ∂_t is timelike at r > r_E^+(θ); • lightlike at ℰ^+; • spacelike within the ergosphere

ℰ := {(t, r, θ, φ) | r_+ < r < r_E^+(θ)};   (9.151)


• lightlike again at ℰ^-; • timelike again for r < r_E^-(θ).
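
The sign changes of g_tt listed above can be illustrated numerically. The Python sketch below is a minimal example with our own function names; it uses g_tt = −(1 − 2mr/ρ²) from (9.110) and the radii from (9.150), and assumes 0 < |a| ≤ m so that the square root is real for every θ.

```python
import math


def ergosurface_radii(m, a, theta):
    """Return (r_E_minus, r_E_plus) at polar angle theta, cf. (9.150)."""
    root = math.sqrt(m * m - (a * math.cos(theta)) ** 2)
    return m - root, m + root


def g_tt(r, theta, m, a):
    """Component g_tt = -(1 - 2mr/rho^2) of the Kerr metric (9.110)."""
    rho2 = r * r + (a * math.cos(theta)) ** 2
    return -(1.0 - 2.0 * m * r / rho2)
```

For m = 1 and a = 0.6 the outer ergosurface sits at r = 2 in the equatorial plane and touches the outer horizon r_+ = 1.8 at the poles; g_tt is negative outside it (∂_t timelike) and positive between the two ergosurfaces (∂_t spacelike).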


In the ergosphere a massive particle cannot be at rest (as it can for r > r_E^+), but it can still escape.
Moreover, in the ergosphere the energy E = −g(u, ∂_t) of a particle with fd four-velocity u can
have either sign (whereas for r > r_E^+(θ) it is positive, since u must be fd). This allows the extraction
of energy from the black hole via the so-called Penrose process. Here, a particle coming from
infinity with necessarily positive energy E_as > 0 falls into the ergosphere and decays into a pair, one
of positive energy E_pos > 0 and one of negative energy E_neg < 0, where E_pos + E_neg = E_as. If the
positive-energy particle subsequently escapes, which is dynamically possible, an amount


E_pos − E_as = −E_neg > 0   (9.152)


of energy has been extracted from the black hole. This extraction is at the expense of its angular
momentum: since −g(u, X) = E − Ω_+ L and −g(u, X) > 0 outside H_+, we have, outside H_+,


E > Ω_+ L,   (9.153)


where X is the Killing vector field (9.149). Hence L < E/Ω_+, so if the hole absorbs a particle
with E_neg < 0, it absorbs negative angular momentum L < E_neg/Ω_+ < 0, i.e., loses it. This
process could continue until the ergosphere disappears and the black hole stops rotating.


479 The vector field ∂_θ is spacelike everywhere, whereas ∂_φ is spacelike for r > 0, timelike in a certain region near
the ring at r = 0, where it gives rise to closed timelike loops, and spacelike again for r sufficiently negative.

480 In Schwarzschild space-time the outer ergosurface coincides with the outer event horizon and hence the
ergosphere is empty. The inner event horizon and inner ergosurface both coincide with the (pointlike) singularity.
Here is a picture of the various geometric structures in or near a Kerr black hole.⁴⁸¹
Picture of important r = constant surfaces in slowly rotating Kerr space-time (in Boyer-Lindquist
coordinates) at fixed t, shown for r > 0 (although in fact r ∈ ℝ). The event horizons H_± where r = r_±
are characterized by Theorem 9.3 as one-way membranes: H_+ is the outer event horizon of a slowly
rotating Kerr black hole, since it is the boundary inside which future (null) infinity can no longer be
reached. The inner event horizon H_- is a Cauchy horizon. Near the singularity ∂_t is timelike and ∂_r
is spacelike, so it can be avoided. The outer ergosphere is the place where the timelike Killing field ∂_t
switches its causal nature from being timelike at r > r_E^+ to lightlike at the outer ergosurface, to spacelike
until one reaches the inner ergosurface, where it becomes lightlike and then timelike once more. The
ergosphere is the region between the outer ergosurface and the outer event horizon; it is the place from
which massive particles (or timelike observers) can no longer be at rest, but can still escape to infinity. In
the extremal case (|a| = m > 0) both horizons H_± coalesce, since r_+ = r_- = m. Furthermore, because
r_E^± = m(1 ± sin θ) the ergosurfaces acquire cusps at θ = 0 and θ = π, at which values all three surfaces
touch each other. Otherwise, since r_E^+ ≥ m ≥ r_E^- (with equalities iff θ = 0 or θ = π), the single horizon
remains enclosed between the outer and inner ergosurfaces. In the fast case (|a| > m > 0) there are no
event horizons and the two ergosurfaces have merged into a single (topological) torus with cusps.⁴⁸²
The same misgivings as to the original or maximally extended Schwarzschild solutions apply to 
this picture as well as to its extensions studied below; notably the instability of the inner event 
horizon and its extravagant if not crazy causal structure. However, in this case there seems to be 
no analogue of the exact Oppenheimer-Snyder solution for a rotating black hole.⁴⁸³
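The ordering of the surfaces described in the caption above can be tabulated with a few lines of code. This is only a sketch: the horizon radii $r_\pm = m \pm \sqrt{m^2-a^2}$ are those of (9.155), whereas the general ergosurface formula $r_E^\pm = m \pm \sqrt{m^2 - a^2\cos^2\theta}$ is the standard Boyer–Lindquist expression, assumed here because it reduces to $m(1 \pm \sin\theta)$ for $|a| = m$. The function name `kerr_surfaces` is purely illustrative.

```python
import math

def kerr_surfaces(m, a, theta):
    """Return (r_minus, r_plus, r_ergo_minus, r_ergo_plus) for |a| <= m.

    Horizons: r_pm = m -/+ sqrt(m^2 - a^2).
    Ergosurfaces (assumed standard formula): m -/+ sqrt(m^2 - a^2 cos^2(theta)).
    """
    if abs(a) > m:
        raise ValueError("fast Kerr (|a| > m): there are no event horizons")
    root_horizon = math.sqrt(m * m - a * a)
    root_ergo = math.sqrt(m * m - (a * math.cos(theta)) ** 2)
    return (m - root_horizon, m + root_horizon, m - root_ergo, m + root_ergo)

# Slow, extremal and Schwarzschild (a = 0) cases at a few polar angles.
for a in (0.0, 0.5, 1.0):
    for theta in (0.0, math.pi / 4, math.pi / 2):
        print(a, round(theta, 3), kerr_surfaces(1.0, a, theta))
```

For $a = 0$ the outer ergosurface coincides with the outer horizon and the inner pair sits at $r = 0$, in line with footnote 480.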


481 Redrawn from Visser (2006) by Edith de Jong. Explanations and formulae as in the original. 
482 See Carter (1973), §7, and Plebański & Krasiński (2006), §20.5 for pictures of the last two cases.
483 As a second best see e.g. Nathanail, Most, & Rezzolla (2017) for numerical simulations. 


Kerr space-time is geodesically incomplete, and not just at the ring singularity. Hence (except in 
the fast case, where it is already complete) it can be extended (with all the qualifications and 
misgivings discussed at the end of §9.5). Here is the relevant Penrose diagram for 0 < |a| < m,
which displays the nature of the (analytic) extension to an inextendible space-time: 


Above: Penrose diagram for the partly extended Kerr solution with $0 < |a| < m$. The complete extension is
an infinite tower: put the part with the green cross on top of the part with the red cross, and put the red
part below the green part, etc. The range of $r$ is now $(-\infty,\infty)$ instead of $(0,\infty)$, so that (at fixed time)
$r = 0$ is a sphere. Penrose diagrams for space-times that lack spherical symmetry (like Kerr) are less
effective than for those that have it (like Schwarzschild and Reissner-Nordström). In particular, the structure
of the (ring) singularity does not come out very well: it is easier for a camel to go through the eye of a
needle (i.e. cross the ring singularity) than for a rich man to enter into the kingdom of God.


Below: Penrose diagram for the partly extended Kerr solution with 0 < |a| = m. Region II has now
disappeared and event horizons and Cauchy horizons coincide, both simply labeled as H~ = Hg = He.
This time the infinite tower is built by placing the entire diagram shown on top of and below itself in such 
a way that the green H` lines match, and repeating this procedure. In this way region III as shown, which 
is a black hole for region I shown, becomes a white hole for the new region I NE of the shown region III. 
Similarly, the new region III SE of the shown region I is a white hole for the latter (the distinction between 
black and white holes thus fades, or rather depends on which region the hole connects with). 
The maximal (analytic) extensions displayed here can be determined on the basis of the (in)completeness
of radial geodesics alone, so that we may freeze $\theta$ and $\varphi$. Doing this shows that the
situation is very similar to Reissner-Nordström: with $u$ and $v$ defined by (9.34) - (9.35), where
$r_*$ depends on the particular case, the Reissner-Nordström and Kerr metrics with $\theta = 0$ and
either $\varphi_+$ (ingoing) or $\varphi_-$ (outgoing) constant are given by

\[ g_{RN} = -h(r)\,du\,dv; \qquad h(r) = \frac{(r-r_+)(r-r_-)}{r^2}; \qquad r_\pm := m \pm \sqrt{m^2-e^2}; \tag{9.154} \]
\[ g_K = -k(r)\,du\,dv; \qquad k(r) = \frac{(r-r_+)(r-r_-)}{r^2+a^2}; \qquad r_\pm := m \pm \sqrt{m^2-a^2}, \tag{9.155} \]


respectively; see (9.89), (9.95), (9.137), and (9.117). These horizons look analogous, but (9.154)
has a singularity whereas (9.155) does not. Since the singularities in the actual (full) solutions are
timelike in both cases and therefore can be avoided, this difference turns out not to matter and
the maximal extension of Kerr space-time is essentially the same as for Reissner-Nordström.
The only difference lies in the structure of the diamonds III: the role of the singularity at $r = 0$ in
Reissner-Nordström is now played by the null infinities $\mathscr{I}^\pm$ at $r = -\infty$.
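The difference between the two radial profiles can be made concrete numerically. The sketch below assumes the forms $h(r) = (r-r_+)(r-r_-)/r^2$ and $k(r) = (r-r_+)(r-r_-)/(r^2+a^2)$ for (9.154) and (9.155); the function names are illustrative only. It shows that both functions vanish at the horizons, that $h$ blows up as $r \to 0$, and that $k$ stays finite at $r = 0$ and for negative $r$.

```python
import math

def horizon_radii(m, c):
    """r_- and r_+ = m -/+ sqrt(m^2 - c^2); c is the charge e (RN) or the spin a (Kerr)."""
    root = math.sqrt(m * m - c * c)
    return m - root, m + root

def h_reissner_nordstrom(r, m, e):
    """Assumed form of h(r) in (9.154); singular at r = 0."""
    r_minus, r_plus = horizon_radii(m, e)
    return (r - r_plus) * (r - r_minus) / (r * r)

def k_kerr_axis(r, m, a):
    """Assumed form of k(r) in (9.155), i.e. Kerr on the axis theta = 0; regular for all real r."""
    r_minus, r_plus = horizon_radii(m, a)
    return (r - r_plus) * (r - r_minus) / (r * r + a * a)

m, c = 1.0, 0.6
print("horizons:", horizon_radii(m, c))
print("k at r = 0 and r = -5:", k_kerr_axis(0.0, m, c), k_kerr_axis(-5.0, m, c))
print("h close to r = 0:", h_reissner_nordstrom(1e-3, m, c))
```

Since $r_+ r_- = a^2$, one finds $k(0) = 1$, so nothing special happens at $r = 0$ on the axis and $r$ can be continued to negative values.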

Apart from satisfying curiosity, the aim of the maximal extension is the following:⁴⁸⁴


Theorem 9.4 The maximally extended Kerr space-time $(M_K, g_K)$ for $0 < |a| < m$ is geodesically
complete, except for geodesics moving into the ring singularity (9.114), which are all incomplete.
In particular, $(M_K, g_K)$ is (smoothly) inextendible, cf. Proposition 6.2.


For the record, the Kerr-Newman metric is obtained by replacing $\Delta$ in the Kerr metric, see
(9.111), by $\Delta_e = r^2 - 2mr + a^2 + e^2$. This turns out to be a solution to the Einstein–Maxwell
equations with axisymmetric vector potential $A = -er(dt - a\sin^2\theta\, d\varphi)/\rho^2$, cf. (9.112).


484 See O’Neill (1995), Theorem 4.3.1, with a 100-page proof through an explicit classification of all geodesics.
10 Black holes II: General theory


The model-independent theory of black holes is based on techniques that were largely developed 
by Penrose in the 1960s. These techniques were initially motivated by the study of gravitational 
radiation, but they could also be applied to black holes, as e.g. in the famous paper from 1965 
for which Penrose was awarded half of the 2020 Nobel Prize for Physics (see chapter 6). 

In his wake, Hawking and others also made important contributions to the abstract study of 
black holes. Around 1970, this led to a mathematical definition of a black hole and its event 
horizon, see (10.78) and (10.79) below,⁴⁸⁵ which is based on Penrose’s idea of null infinity and
the associated notion of a conformal completion of space-time.⁴⁸⁶ According to this definition it
is not the singularity but the event horizon that defines a black hole. This seems reasonable, since
it is the event horizon that makes the hole “black”. However, Penrose’s 1965 singularity theorem,
i.e. Theorem 6.15, does not say anything about event horizons; in fact, the theorem even makes
event horizons unnecessary as a means for covering singularities, because the assumption of 
a Cauchy surface already suffices to make these invisible to the outside world (see especially 
Corollary 10.10 below). To overcome the discrepancy between his theorem and black holes, 
Penrose launched his great cosmic censorship conjectures, which we will discuss in detail. 

We then analyze the structure of various black hole horizons, notably event horizons, Cauchy 
horizons, and Killing horizons, and discuss the uniqueness or “no hair” theorems for black holes. 
These culminate in Penrose’s “final state conjecture” and the associated Penrose inequality. 
We close this chapter with a brief survey of the amazing laws of black hole thermodynamics. 
Although these laws can be formulated and even proved within classical GR, they can only be 
understood if quantum (field) theory is invoked. Alas, this exceeds the scope of our book. 


485 The following information is slightly adapted from the appendix of Landsman (2021), provided by Erik Curiel.
‘Penrose (1968), p. 188, defines an event horizon as the boundary of the chronological past of a timelike curve 
(essentially the same definition, including the name, as given by Rindler 1956), and notes (p. 206) that r = 2m in 
Schwarzschild is one. The term “black hole” does not appear in that essay, nor any definition remotely like ‘the 
complement of the causal past of future null infinity’. Penrose (1969), which is one of the most important and 
visionary papers ever written about gravitational collapse and black holes, does use the term “black hole” (probably 
the first use in the academic general relativity literature, though the term itself was apparently already used in the 
early 1960s by Dicke in discussion with a popular science writer), but he always encloses it in scare quotes. In 
footnote 3 on p. 1146, Penrose almost literally gives the definition (10.79) of an ‘absolute event horizon’, written 
in words as ‘the boundary of the union of all timelike curves which escape to this external future infinity’ and 
in a formula as $\partial(I^-[\mathscr{I}^+])$, [and] he does so in the context of [weakly] asymptotically simple space-times, which
[include] black holes! [This] seems to be the first appearance of definition (10.79) in the literature. Carter (1971b) 
does not give a formal definition of “black hole”, but he does give an informal definition of ‘domain of outer 
communication’, and says (p. 331) that ‘“black holes” [are] regions of space-time beyond the domain of outer 
communication.’ The first explicit definition of a “black hole” as the ‘connected component of the complement of 
the causal past of future null infinity’ is in Hawking (1972). This is repeated in Hawking & Ellis (1973), $9.2, and 
seems to have been standard ever since (at least in mathematical physics).’ See also footnote 626 for the historical 
connection with Hawking’s area law, which was predicated on using the absolute event horizon $\partial J^-(\mathscr{I}^+)$.

486 When Penrose introduced conformal completions and the ensuing diagrams now named after him in GR (see 
below), these provided a completely new way of looking at boundary conditions and asymptotic flatness (Friedrich, 
2011). Since Penrose started in algebraic geometry (as a PhD student of Hodge in Cambridge, later switching to 
Todd), in finding both conformal completions and the associated diagrams he was undoubtedly influenced by the 
theory of Riemann surfaces, one of whose founders was Weyl (1913), the pioneer of the conformal approach to
GR! Indeed, Riemann surfaces may equivalently be defined as either one-dimensional complex manifolds, or as 
two-dimensional Riemannian manifolds up to conformal equivalence. The key examples of the Riemann sphere 
and the Poincaré upper half-plane and disc D (both actually first found by Beltrami) will be reviewed in the next 
section. The Poincaré disc D lies at the basis of the famous Circle Limit woodcuts by Escher (nos. I-IV, dating 
from 1958-1960), see §4.4 for number IV, with which Penrose was well familiar. See also the Introduction. 


10.1 Conformal completions of space-time 


Penrose’s approach to GR is typically based on conformal transformations (cf. §1.9), i.e.
\[ \hat g = \Omega^2 g; \qquad \hat g_{\mu\nu}(x) = \Omega(x)^2 g_{\mu\nu}(x), \tag{10.1} \]

where initially $\Omega : M \to (0,\infty)$ is strictly positive.⁴⁸⁷ The idea is that $\Omega$ decreases near “infinity”
in such a way that large $g$-distances become small with respect to $\hat g$, with the goal of bringing
“infinity” within a finite $\hat g$-distance. To make this idea more precise, we first consider the
Euclidean plane $(\mathbb{R}^2, g)$, where $g$ is the usual flat metric. In polar coordinates, this reads
\[ g = dr^2 + r^2 d\varphi^2. \tag{10.2} \]
Now consider the two-sphere $S^2$, whose metric in the usual spherical coordinates $(\theta,\varphi)$ reads
\[ \hat g = d\theta^2 + \sin^2\theta\, d\varphi^2. \tag{10.3} \]
Define a diffeomorphism $i : \mathbb{R}^2 \to S^2\setminus N$, with $N = (0,0,1)$ i.e. $\theta = 0$, with its inverse, by
\[ i(r,\varphi) := (2\arctan(1/r), \varphi); \qquad i^{-1}(\theta,\varphi) = (1/\tan(\theta/2), \varphi); \tag{10.4} \]


the inverse $i^{-1} : S^2\setminus N \to \mathbb{R}^2$ is the familiar stereographic projection.⁴⁸⁸ The point is that
$i : \mathbb{R}^2 \hookrightarrow S^2$ is a conformal embedding, in the sense that as a relation on $\mathbb{R}^2$ we have
\[ i^*\hat g = (i^*\Omega^2)\, g, \tag{10.5} \]
a subtle variation of (10.1), where the conformal factor $\Omega : S^2 \to [0,\infty)$, also defined on $N$, is
\[ \Omega(\theta,\varphi) = 2\sin^2(\theta/2). \tag{10.6} \]

This function is strictly positive on $i(\mathbb{R}^2) = S^2\setminus N$ but vanishes at $N$, as the image under $i$ of all
points at infinity (i.e. $r \to \infty$). This property is crucial in keeping $\hat g$ finite whilst $g$ measures ever
longer distances towards “infinity”. We call $(S^2, \hat g)$ a conformal compactification of $(\mathbb{R}^2, g)$.
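The conformal relation (10.5) for the stereographic embedding can be checked numerically. The sketch below assumes the formulas $i(r,\varphi) = (2\arctan(1/r),\varphi)$ and $\Omega = 2\sin^2(\theta/2)$ from (10.4) and (10.6), together with the Cartesian form of $i$ from footnote 488; the derivative $d\theta/dr$ is approximated by a central difference, and all function names are illustrative.

```python
import math

def embed(r, phi):
    """i(r, phi) = (2 arctan(1/r), phi), cf. (10.4); requires r > 0."""
    return 2.0 * math.atan(1.0 / r), phi

def embed_cartesian(x, y):
    """Cartesian form of i from footnote 488."""
    d = x * x + y * y + 1.0
    return 2.0 * x / d, 2.0 * y / d, (x * x + y * y - 1.0) / d

def omega(theta):
    """Conformal factor (10.6)."""
    return 2.0 * math.sin(theta / 2.0) ** 2

def pullback_components(r, phi, h=1e-6):
    """Diagonal components (dr^2, dphi^2) of i^* g_hat for g_hat = dtheta^2 + sin^2(theta) dphi^2."""
    theta, _ = embed(r, phi)
    dtheta_dr = (embed(r + h, phi)[0] - embed(r - h, phi)[0]) / (2.0 * h)
    return dtheta_dr ** 2, math.sin(theta) ** 2

r, phi = 0.7, 1.1
theta, _ = embed(r, phi)
print("pullback:      ", pullback_components(r, phi))
print("Omega^2 * flat:", (omega(theta) ** 2, omega(theta) ** 2 * r * r))
```

Both lines should agree up to the finite-difference error, reflecting $i^*\hat g = 4(dr^2 + r^2 d\varphi^2)/(1+r^2)^2$ and $\Omega \circ i = 2/(1+r^2)$.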
A beautiful example in the same dimension is the Poincaré upper half-plane $(\mathbb{H}, g)$, i.e.
\[ \mathbb{H} = \{x+iy \in \mathbb{C} \mid y > 0\}; \qquad g = \frac{dx^2+dy^2}{y^2}, \tag{10.7} \]
which is a model of 2d hyperbolic geometry, cf. §4.4. It is related to the Poincaré disc $(\mathbb{D}, g)$,
\[ \mathbb{D} = \{x+iy \in \mathbb{C} \mid x^2+y^2 < 1\}; \qquad g = 4\,\frac{dx^2+dy^2}{(1-x^2-y^2)^2}, \tag{10.8} \]
through the isometry $i : \mathbb{H} \to \mathbb{D}$ defined by the Cayley transform
\[ i(z) = \frac{z-i}{z+i}; \qquad i^{-1}(z) = i\,\frac{1+z}{1-z}, \tag{10.9} \]
which is also defined on the boundary $\partial\mathbb{H} = \{x+iy \in \mathbb{C} \mid y = 0\}$ and maps this onto $\mathbb{T}\setminus\{1\}$.


487 In our notation (which differs from many texts) $g$ is the physical metric, usually solving the Einstein equations.
488 Our spherical coordinates $(\theta,\varphi)$ on $S^2$ are defined by $(x,y,z) = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$. In cartesian
coordinates on both $\mathbb{R}^2$ and $S^2$ we have $i(x,y) = (2x, 2y, x^2+y^2-1)/(x^2+y^2+1)$ and $i^{-1}(x,y,z) = (x,y)/(1-z)$.
The well-known fact that $i$ is an isometry implies that if we now define $(\hat{\mathbb{D}}, \hat g)$ by
\[ \hat{\mathbb{D}} := \overline{\mathbb{D}} = \{x+iy \in \mathbb{C} \mid x^2+y^2 \leq 1\}; \qquad \hat g = dx^2 + dy^2, \tag{10.10} \]

then (10.5) holds with conformal factor $\Omega : \hat{\mathbb{D}} \to [0,\infty)$, $\Omega(x,y) = \tfrac12(1 - x^2 - y^2)$. Once again,
“infinity” for $(\mathbb{H},g)$, consisting of both the $x$-axis (which because of the factor $1/y^2$ in $g$ is
metrically speaking infinitely far from any point in $\mathbb{H}$, in that it takes infinite arc length to get
there via a geodesic) and all other points where $r \to \infty$ ($r = \sqrt{x^2+y^2}$), has been brought into the
finite realm, which this time consists not of a single point, as in the case of $S^2$, but of the circle
$\partial\hat{\mathbb{D}} = \mathbb{T} = \{x+iy \in \mathbb{C} \mid x^2+y^2 = 1\}$, where $\Omega$ duly vanishes. The single point $1 \in \mathbb{T}$ does
absorb the entire “$r = \infty$” infinity of $\mathbb{H}$, whereas the remainder $\mathbb{T}\setminus\{1\}$ takes care of the $x$-axis.
Thus the boundary points in $\hat{\mathbb{D}}$ have a somewhat different status in so far as their origin in $\mathbb{H}$
is concerned, but from the point of view of the Riemannian manifold (with boundary) $(\hat{\mathbb{D}}, \hat g)$
itself the symmetries of the model guarantee that these distinctions are lost.⁴⁸⁹ Of course, more
directly one may also start from $(\mathbb{D}, g)$ and consider $(\hat{\mathbb{D}}, \hat g)$ to be its conformal completion.
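The statements about the Cayley transform are easy to test with complex arithmetic. The following sketch assumes the curvature $-1$ normalizations $g = (dx^2+dy^2)/y^2$ on $\mathbb{H}$ and $g = 4(dx^2+dy^2)/(1-x^2-y^2)^2$ on $\mathbb{D}$, and $i(z) = (z-i)/(z+i)$; since $i$ is holomorphic, the pulled-back metric is the disc factor times $|i'(z)|^2$. With this normalization the flat metric on the disc equals $\Omega^2 g$ for $\Omega = \tfrac12(1-x^2-y^2)$. The function names are illustrative.

```python
import math

def cayley(z):
    """Cayley transform from the upper half-plane to the unit disc."""
    return (z - 1j) / (z + 1j)

def cayley_inverse(w):
    """Inverse Cayley transform, defined for w != 1."""
    return 1j * (1 + w) / (1 - w)

def half_plane_factor(z):
    """Conformal factor of the half-plane metric: 1 / y^2."""
    return 1.0 / z.imag ** 2

def disc_factor(w):
    """Conformal factor of the disc metric (assumed normalization): 4 / (1 - |w|^2)^2."""
    return 4.0 / (1.0 - abs(w) ** 2) ** 2

def pulled_back_disc_factor(z):
    """Disc factor at i(z) times |i'(z)|^2, with i'(z) = 2i / (z + i)^2."""
    derivative = 2j / (z + 1j) ** 2
    return disc_factor(cayley(z)) * abs(derivative) ** 2

def omega_disc(w):
    """Factor with Omega^2 * (disc metric) = flat metric."""
    return 0.5 * (1.0 - abs(w) ** 2)

z = 0.3 + 0.8j
print(pulled_back_disc_factor(z), half_plane_factor(z))  # equal if i is an isometry
print(abs(cayley(2.5 + 0j)))                             # real axis lands on the unit circle
```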
Penrose magisterially adapted such examples to a space-time context, as follows:⁴⁹⁰


Definition 10.1 A conformal completion of a (non-compact) space-time $(M, g)$ is a space-time
$(\hat M, \hat g)$, where $\hat M$ is a manifold with boundary,⁴⁹¹ along with an embedding

\[ i : M \hookrightarrow \hat M; \qquad i(M) = \mathrm{int}(\hat M) := \hat M \setminus \partial\hat M, \tag{10.11} \]

that is conformal in that $i^*\hat g = i^*\Omega^2 g$ for some smooth positive function $\Omega : \hat M \to \mathbb{R}^+$, such that:

\[ \Omega > 0 \text{ on } i(M); \qquad \Omega = 0 \text{ on } \partial\hat M; \qquad d\Omega \neq 0 \text{ on } \partial\hat M. \tag{10.12} \]

We also require that the boundary $\partial\hat M$ consist of null infinity $\mathscr{I}$ (pronounced “scri”), in that
\[ \partial\hat M = \mathscr{I} := \mathscr{I}^+ \cup \mathscr{I}^-; \qquad \mathscr{I}^\pm := \partial\hat M \cap J^\pm(M), \tag{10.13} \]

where $J^\pm$ is computed in $\hat M$. This defines future null infinity $\mathscr{I}^+$ and past null infinity $\mathscr{I}^-$.


In what follows we often tacitly identify $M$ with $i(M)$, so that $\hat g$ and $g$ are related by (10.1),
understood to hold on $M \cong i(M)$ only, rather than all of $\hat M$; indeed, $g$ is not defined on $\partial\hat M$.
This definition does not fix $\Omega$, but the identification of the boundary $\partial\hat M$ with null infinity $\mathscr{I}$
and the conditions (10.12) are well served by choosing $\Omega$ such that $\Omega(\gamma(s)) \sim C/s$ for some
constant $C$ as $s \to \infty$, along all complete lightlike geodesics affinely parametrized by $s$; see §10.2.


489 Being an isometry, $i : \mathbb{H} \to \mathbb{D}$ maps geodesics of $(\mathbb{H},g)$ to geodesics of $(\mathbb{D},g)$. The former are either
semicircles hitting the $x$-axis at right angles, or straight vertical lines. The latter are segments of circles that
intersect $\partial\mathbb{D}$ orthogonally, including the limiting case of straight lines through the origin (which need not be images
of straight lines in $\mathbb{H}$, not even when one of the endpoints happens to be $1 \in \mathbb{T}$). See e.g. Beardon (1983), chapter 7.

490 See Penrose (1964), who (in the context of gravitational waves) adds the condition that every lightlike geodesic
has two end-points on $\partial\hat M$, defining $(M,g)$ to be asymptotically simple. This excludes black hole space-times and
so we will not use it, following Chruściel (2020), §3.1. For more on conformal completions see Hawking & Ellis
(1973), §6.9, Geroch (1977), Wald (1984), §11.1, Penrose & Rindler (1986), chapter 9, Stewart (1991), chapter 3,
Frauendiener (2000), and Valiente Kroon (2016). In connection with asymptotic flatness, see also footnote 497.

491 See §2.6. Note that $(\hat M, \hat g)$ has no corners, unlike Penrose diagrams: points such as $i^0$ and $i^\pm$ of such diagrams
are not included in $\hat M$, which unlike the Riemannian examples is not compact. We assume that the boundary $\partial\hat M$ is
smooth; whether this is really the case for specific space-times is a subtle issue, taken up in footnote 497. One may
define spatial and timelike conformal completions that include these points (Geroch, 1977; Ashtekar, 1980).


10.2 Conformal completion and Penrose diagrams 


We first illustrate the idea of a conformal completion for Minkowski space-time $\mathbb{M}$. It is
convenient to move from Cartesian coordinates $(x^0,x^1,x^2,x^3)$ to polar ones $(t,r,\theta,\varphi)$, and
replace $(t,r)$ by lightlike coordinates $(u,v) \in \mathbb{R}^2$ (i.e. $\partial_u$ and $\partial_v$ are lightlike vectors), defined as

\[ u := t - r, \qquad t = \tfrac12(v+u); \tag{10.14} \]
\[ v := t + r, \qquad r = \tfrac12(v-u), \tag{10.15} \]

so that $v \geq u$, and $v = u$ iff $r = 0$. In the coordinates $(u,v,\theta,\varphi)$, the Minkowski metric reads

\[ \eta = -du\,dv + \tfrac14(u-v)^2(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{10.16} \]
This formula implies that the curves

\[ s \mapsto (u(s),v(s),\theta(s),\varphi(s)) = (u_0, v_0+s, \theta_0, \varphi_0); \tag{10.17} \]
\[ s \mapsto (u(s),v(s),\theta(s),\varphi(s)) = (u_0-s, v_0, \theta_0, \varphi_0), \tag{10.18} \]
both defined for $u_0 - v_0 < s < \infty$, are radial lightlike geodesics: eq. (10.17), where $u$ is constant,
is future directed (fd) whilst (10.18), where $v$ is constant, is past directed (pd). In line with the
(second) comment following Definition 10.1, we now define $\Omega$ initially on $\mathbb{M}$ by

\[ \Omega(u,v,\theta,\varphi) := (1+u^2)^{-1/2}(1+v^2)^{-1/2}; \tag{10.19} \]
\[ \Omega(p,q,\theta,\varphi) = \cos p \cos q, \tag{10.20} \]

where for later use we have also introduced the ‘compactifying’ coordinates $(p,q)$ defined by

\[ p := \arctan v; \qquad v = \tan p, \tag{10.21} \]
\[ q := \arctan u; \qquad u = \tan q, \tag{10.22} \]

where $p,q \in (-\tfrac12\pi,\tfrac12\pi)$ and $p \geq q$. This turns the original and rescaled metrics into

\[ \eta = \frac{1}{\cos^2 p \cos^2 q}\left(-dp\,dq + \tfrac14\sin^2(p-q)\cdot(d\theta^2 + \sin^2\theta\, d\varphi^2)\right); \tag{10.23} \]
\[ \hat\eta = \Omega^2\eta = -dp\,dq + \tfrac14\sin^2(p-q)\cdot(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{10.24} \]
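The passage from (10.16) to (10.24) rests on three identities that can be verified numerically: the two expressions (10.19) and (10.20) for $\Omega$ agree, $\Omega^2 \cdot \tfrac14(u-v)^2 = \tfrac14\sin^2(p-q)$, and $\Omega^2\, du\, dv = dp\, dq$. The sketch below checks them at one sample point; derivatives of $\tan$ are taken by central differences, and the function names are illustrative.

```python
import math

def null_coordinates(t, r):
    """(u, v) = (t - r, t + r), cf. (10.14)-(10.15)."""
    return t - r, t + r

def compactify(u, v):
    """(p, q) = (arctan v, arctan u), cf. (10.21)-(10.22)."""
    return math.atan(v), math.atan(u)

def omega_uv(u, v):
    """Conformal factor in the form (10.19)."""
    return 1.0 / math.sqrt((1.0 + u * u) * (1.0 + v * v))

def omega_pq(p, q):
    """Conformal factor in the form (10.20)."""
    return math.cos(p) * math.cos(q)

def rescaled_sphere_coefficient(u, v):
    """Omega^2 (u - v)^2 / 4: coefficient of the round sphere metric in Omega^2 eta."""
    return omega_uv(u, v) ** 2 * (u - v) ** 2 / 4.0

def rescaled_null_coefficient(p, q, h=1e-6):
    """Omega^2 (du/dq)(dv/dp): coefficient of -dp dq in Omega^2 eta."""
    du_dq = (math.tan(q + h) - math.tan(q - h)) / (2.0 * h)
    dv_dp = (math.tan(p + h) - math.tan(p - h)) / (2.0 * h)
    return omega_pq(p, q) ** 2 * du_dq * dv_dp

u, v = null_coordinates(0.3, 1.7)
p, q = compactify(u, v)
print(omega_uv(u, v), omega_pq(p, q))
print(rescaled_sphere_coefficient(u, v), math.sin(p - q) ** 2 / 4.0)
print(rescaled_null_coefficient(p, q))
```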


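We are now in a position to define the conformal completion $(\hat{\mathbb{M}}, \hat\eta)$ of $(\mathbb{M}, \eta)$ as

\[ \hat{\mathbb{M}} := \{(p,q,\theta,\varphi) \mid (p,q) \in (-\tfrac12\pi,\tfrac12\pi)^2,\ p \geq q,\ (\theta,\varphi) \in S^2\} \cup \mathscr{I}^+ \cup \mathscr{I}^-; \tag{10.25} \]
\[ \mathscr{I}^+ := \{(p,q,\theta,\varphi) \mid p = \tfrac12\pi,\ q \in (-\tfrac12\pi,\tfrac12\pi),\ (\theta,\varphi) \in S^2\}; \tag{10.26} \]
\[ \mathscr{I}^- := \{(p,q,\theta,\varphi) \mid p \in (-\tfrac12\pi,\tfrac12\pi),\ q = -\tfrac12\pi,\ (\theta,\varphi) \in S^2\}, \tag{10.27} \]

and $\hat\eta$ given by (10.24), now also defined on the boundary $\mathscr{I} = \mathscr{I}^+ \cup \mathscr{I}^-$, where it is perfectly
regular. Finally, the embedding $i : \mathbb{M} \hookrightarrow \hat{\mathbb{M}}$ is given by $i(u,v,\theta,\varphi) = (\arctan v, \arctan u, \theta, \varphi)$.
A characteristically beautiful drawing of $\hat{\mathbb{M}}$ in Penrose’s own hand, including the meaning
of the $(p,q,\theta)$ coordinates, with the $\varphi$-coordinate suppressed, may be found on the next page.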
The conformal completion $(\hat{\mathbb{M}}, \hat\eta)$ of Minkowski space-time $(\mathbb{M},\eta)$, with the $\varphi$-coordinate suppressed,
taken from Penrose (1964). Future timelike infinity $I^+$, past timelike infinity $I^-$, and spacelike infinity
$I^0$ (called $i^+$, $i^-$, and $i^0$ in the main text, following current notation), are drawn, but do not belong to
$\hat{\mathbb{M}}$. Likewise, the shells at $p > \tfrac12\pi$ and $q < -\tfrac12\pi$ are not part of $\hat{\mathbb{M}}$, which “ends” at $I^0$ and is a rotated
diamond without the equatorial circle and north and south poles; they are just drawn to clarify the
meaning of the coordinates. Also, the caps above $I^+$ and below $I^-$ are not part of $\hat{\mathbb{M}}$. Metrically $I^0$ is a
point, like $I^+$ and $I^-$, rather than a circle.
In a Penrose diagram of $(\mathbb{M},\eta)$, or indeed of any space-time $(M,g)$ admitting a conformal
completion, one suppresses the angles $(\theta,\varphi)$ and draws $\hat{\mathbb{M}}/S^2$ (or $\hat M/S^2$) in such a way that
lightlike geodesics are at $\pm 45°$, as in $\mathbb{M}$ (this leads to some distortions in case $g$ is not spherically
symmetric, as e.g. the Kerr metric). This is an important tool for visualizing especially black
holes. The points $i^\pm$ and $i^0$ defined below are typically included in such diagrams, although they
are not part of $\hat M$. The Penrose diagram of Minkowski space-time, then, is as follows:




Penrose diagram for Minkowski space-time in the $(p,q)$ coordinates, where $(p,q) \in [-\pi/2,\pi/2]^2$ subject
to $p \geq q$; the zigzag line $r = 0$ corresponds to $p = q$; it is a boundary to the diagram and as such
“singular”, but this is an unfortunate coordinate singularity. The green line (constant $q$) is a fd lightlike
geodesic and the red line (constant $p$) is a pd lightlike geodesic. The three corners are

\[ i^- = (-\pi/2,-\pi/2); \qquad i^0 = (\pi/2,-\pi/2); \qquad i^+ = (\pi/2,\pi/2), \tag{10.28} \]
whereas the smooth boundary components are given by, cf. (10.26) - (10.27),

\[ \mathscr{I}^+ = \{(p,q) \mid p = \tfrac12\pi,\ q \in (-\tfrac12\pi,\tfrac12\pi)\}; \tag{10.29} \]
\[ \mathscr{I}^- = \{(p,q) \mid q = -\tfrac12\pi,\ p \in (-\tfrac12\pi,\tfrac12\pi)\}. \tag{10.30} \]


• Future null infinity $\mathscr{I}^+$ corresponds to $v = \infty$ at finite $u$, i.e. $r \to \infty$ and $t \to \infty$ at fixed $t - r$.
All future inextendible fd lightlike geodesics end in $\mathscr{I}^+$, and all its points occur in this way.


• Past null infinity $\mathscr{I}^-$ corresponds to $u = -\infty$ at finite $v$, or $r \to \infty$ and $t \to -\infty$ at fixed $t + r$.
All past inextendible pd lightlike geodesics end in $\mathscr{I}^-$, and all its points occur in this way.


• Future timelike infinity $i^+$ corresponds to $u = v = \infty$, i.e. $t \to \infty$ at finite $r$, and as such is the
single endpoint of all future inextendible fd timelike geodesics.


• Past timelike infinity $i^-$ corresponds to $u = v = -\infty$, i.e. $t \to -\infty$ at finite $r$, and is the single
endpoint of all past inextendible pd timelike geodesics.


• Spacelike infinity $i^0$ corresponds to $u = -\infty$ and $v = \infty$, i.e. $r \to \infty$ at finite $t$, and is the single
endpoint of all inextendible spacelike geodesics.
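The five limits listed above can be illustrated by pushing $(t,r)$ far out and reading off the compactified coordinates $(p,q) = (\arctan(t+r), \arctan(t-r))$. In this sketch the constant `BIG` is an arbitrary stand-in for “infinity” and the function name is illustrative.

```python
import math

HALF_PI = math.pi / 2
BIG = 1.0e9  # arbitrary large number standing in for "infinity"

def penrose_point(t, r):
    """Map (t, r) with r >= 0 to the compactified coordinates (p, q)."""
    u, v = t - r, t + r
    return math.atan(v), math.atan(u)

future_timelike = penrose_point(BIG, 1.0)     # t -> +infinity at fixed r
past_timelike = penrose_point(-BIG, 1.0)      # t -> -infinity at fixed r
spacelike = penrose_point(0.0, BIG)           # r -> infinity at fixed t
future_null = penrose_point(0.5 + BIG, BIG)   # r -> infinity at fixed u = t - r = 0.5
past_null = penrose_point(0.5 - BIG, BIG)     # r -> infinity at fixed v = t + r = 0.5

for name, point in [("i+", future_timelike), ("i-", past_timelike), ("i0", spacelike),
                    ("scri+", future_null), ("scri-", past_null)]:
    print(name, point)
```

The first three points approach the corners (10.28), while the last two approach $p = \tfrac12\pi$ at finite $q$ and $q = -\tfrac12\pi$ at finite $p$, as in (10.29) - (10.30).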
Since affinely parametrized radial lightlike geodesics with respect to $\hat\eta$ are simply given by

\[ \hat s \mapsto (p(\hat s), q(\hat s)) = (p_0 + \hat s, q_0) \quad \text{(future directed)}; \tag{10.31} \]
\[ \hat s \mapsto (p(\hat s), q(\hat s)) = (p_0, q_0 - \hat s) \quad \text{(past directed)}, \tag{10.32} \]

it should be clear that all points in $\mathscr{I}^+$ and $\mathscr{I}^-$, respectively, are reached by such geodesics, so
that the last condition of Definition 10.1 is met. Note that for the geodesic (10.31) we have

\[ \Omega(p(\hat s), q(\hat s)) = \cos(p_0 + \hat s)\cos(q_0) \approx -\cos(q_0)(\hat s + p_0 - \tfrac12\pi), \tag{10.33} \]

as the $\hat\eta$-geodesic $\hat\gamma$ in question approaches $\mathscr{I}^+$, i.e. as $p(\hat s) \to \tfrac12\pi$. By an affine reparametrization
such that $\hat s = 0$ when $\hat\gamma(\hat s) \in \mathscr{I}^+$, we may therefore achieve that near $\mathscr{I}^+$ we have

\[ \Omega(\hat\gamma(\hat s)) \sim -\hat s. \tag{10.34} \]

From the point of view of the original metric $\eta$, by (10.17) and (10.19) the same geodesic, but
now affinely parametrized with respect to $\eta$ and relabeled $\gamma(s)$, i.e. $\gamma(s(\hat s)) = \hat\gamma(\hat s)$, gives

\[ \Omega(\gamma(s)) \sim 1/s, \tag{10.35} \]

as $\mathscr{I}^+$ is approached, i.e. as $s \to \infty$. This reconfirms the name “future null infinity” for $\mathscr{I}^+$.

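More generally, suppose $\hat g = \Omega^2 g$ are conformally related metrics (not necessarily flat or
even Lorentzian). Let $\hat s \mapsto \hat\gamma(\hat s)$ be a $\hat g$-geodesic (which by convention is affinely parametrized),
with corresponding $g$-geodesic $s \mapsto \gamma(s)$. Then a straightforward calculation gives

\[ \frac{ds}{d\hat s} = \frac{1}{\Omega(\hat\gamma(\hat s))^2}. \tag{10.36} \]

Hence if (10.34) holds, as can always be achieved because of (10.12), then (10.35) follows: integrating
$ds/d\hat s = 1/\hat s^2$ gives $s = -1/\hat s$ up to an additive constant, so that $\Omega \sim -\hat s = 1/s$.
We now state some important properties of $\mathscr{I}^\pm$. Since each point in the diagram (excluding
$i^\pm$ and $i^0$) is a two-sphere $S^2$, eqs. (10.29) - (10.30) give topologically and diffeomorphically,

\[ \mathscr{I}^+ \cong \mathscr{I}^- \cong \mathbb{R} \times S^2. \tag{10.37} \]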


However, it follows from (10.24) that the would-be two-spheres at $i^\pm$ and $i^0$ have zero radius and
hence should be seen as points (once again, these do not belong to $\hat{\mathbb{M}}$). The wiggly line marked
$r = 0$ is indeed a line (i.e. it is homeomorphic to $\mathbb{R}$); its singular appearance as a boundary in
the Penrose diagram is a consequence of the fact that such diagrams are pictures of $\hat{\mathbb{M}}/SO(3)$
rather than of $\hat{\mathbb{M}}$ itself.⁴⁹² This can already be seen in the usual (defining) action of $SO(3)$ on
$\mathbb{R}^3$, where the quotient $\mathbb{R}^3/SO(3) \cong [0,\infty)$ has zero as a boundary point, corresponding to the
fact that the stabilizer of $(r,\theta,\varphi)$ suddenly changes from $SO(2)$ for any $r > 0$ to $SO(3)$ at $r = 0$.
Let us also note that $\mathscr{I}^+$ and $\mathscr{I}^-$ are null hypersurfaces (see Definition 4.15); for example,
we see from (10.29) that $T_x\mathscr{I}^+$ is spanned by vectors $(\partial/\partial q, \partial/\partial\theta, \partial/\partial\varphi)$, upon which (10.24)
shows that $\partial/\partial q$ is both normal and tangent to $\mathscr{I}^+$, and hence lightlike (likewise for $\mathscr{I}^-$ with
$q \leftrightarrow p$). This implies that the metric $\hat\eta$ is degenerate on $\mathscr{I}^\pm$; e.g. at future null infinity $\mathscr{I}^+$ it is

\[ \hat\eta|_{\mathscr{I}^+} = \tfrac14(\cos^2 q)\cdot(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{10.38} \]


492 Although a general metric $g \in T^{(2,0)}M$, i.e. “$g_{\mu\nu}$”, does not push forward to $M/SO(3)$ under the canonical
projection $\pi : M \to M/SO(3)$, its inverse $g^{-1} \in T^{(0,2)}M$, i.e. “$g^{\mu\nu}$”, does. As long as the $SO(3)$-orbits are spacelike
two-spheres (i.e. even if rotations fail to be isometries), the pushforward $\pi_* g^{-1} \in T^{(0,2)}(M/SO(3))$ is invertible
and its inverse $g_2 = (\pi_* g^{-1})^{-1} \in T^{(2,0)}(M/SO(3))$ is a Lorentzian metric of signature $(-+)$ on $M/SO(3)$.
Future (and past) null infinity of $\mathbb{M}$ has another desirable property, which is shared by the usual
black hole space-times like Schwarzschild and Kerr, namely completeness. It takes some effort
to define what this means, but the idea is that at least sufficiently far away (from the black hole, if
any), there is no end to the future. This should be expressed technically by the fact that lightlike
geodesics within $\mathscr{I}^+$ extend infinitely into the future, but unfortunately this is not the case for
the choice of the conformal completion $(\hat{\mathbb{M}}, \hat\eta, \Omega)$ used so far:⁴⁹³ lightlike geodesics within $\mathscr{I}^+$
at constant $(\theta,\varphi)$, and $p = \tfrac12\pi$, simply take the form $q(s) = q_0 + s$ with affine parameter $s$, and
then come to a stop as $q(s) \to \tfrac12\pi$, which of course happens for some $s < \infty$.

This can be remedied by a different choice of $\Omega$ and hence of $\hat\eta$ (since $\eta = \Omega^{-2}\hat\eta$ is fixed),
keeping $\hat{\mathbb{M}}$ as it is. We give a systematic formulation in the next section, but for the moment we
note that changing $\Omega$ to $\Omega' = \omega\Omega$, with $\omega(p,q) = 2/\sin(p-q)$, rescales (10.24) into

\[ \hat\eta' = -4\,\frac{dp\,dq}{\sin^2(p-q)} + d\theta^2 + \sin^2\theta\, d\varphi^2. \tag{10.39} \]
It follows that for this metric at $\mathscr{I}^+$, i.e. for $p = \tfrac12\pi$, we have $\Gamma^q_{qq} = 2\tan q$, so that lightlike
geodesics $\gamma$ within $\mathscr{I}^+$ at constant $(\theta,\varphi)$ are given, perhaps after affine reparametrization,⁴⁹⁴ by
\[ \gamma(s) = (p(s), q(s), \theta(s), \varphi(s)) = (\tfrac12\pi, \arctan s, \theta_0, \varphi_0). \tag{10.40} \]
These geodesics rule the null hypersurface $\mathscr{I}^+$ (in that each point of $\mathscr{I}^+$ lies on one of them)
and are complete (in the usual sense of being defined for all $s \in \mathbb{R}$), reaching the boundary point
$i^+$ of $\mathscr{I}^+$, i.e. future timelike infinity, as $s \to \infty$, and $i^0$, i.e. spacelike infinity, as $s \to -\infty$.
Finally, for a new perspective on the conformal completion of $\mathbb{M}$ we take the 4d cylinder

\[ E = \mathbb{R} \times S^3, \tag{10.41} \]

where the 3-sphere $S^3 = \{(x_1,x_2,x_3,x_4) \in \mathbb{R}^4 \mid x_1^2+x_2^2+x_3^2+x_4^2 = 1\} \subset \mathbb{R}^4$ is coordinatized by

\[ x_1 = \cos\chi; \quad x_2 = \sin\chi\cos\theta; \quad x_3 = \sin\chi\sin\theta\cos\varphi; \quad x_4 = \sin\chi\sin\theta\sin\varphi, \tag{10.42} \]
where $\chi \in [0,\pi]$, $\theta \in [0,\pi]$, and $\varphi \in [0,2\pi]$. This space has a Lorentzian metric,⁴⁹⁵ given by
\[ \hat g = -dT^2 + g_{S^3} = -dT^2 + d\chi^2 + \sin^2\chi\cdot(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{10.43} \]
To relate this to Minkowski space-time $(\mathbb{M},\eta)$, recall (10.21) - (10.22) and put
\[ T = p + q; \qquad \chi = p - q. \tag{10.44} \]

Given the range $p,q \in (-\tfrac12\pi,\tfrac12\pi)$ and $p \geq q$, this yields $T \in (-\pi,\pi)$ and $\chi \in [0,\pi)$, so that
we may embed $\mathbb{M}$ into $E$ via $i : \mathbb{M} \hookrightarrow \hat{\mathbb{M}}$ as defined before and subsequently regarding $\hat{\mathbb{M}}$ as
a subspace of $E$; the closure of $i(\mathbb{M}) \subset E$ in $E$ is $\hat{\mathbb{M}}$ with the corners $i^\pm$ and $i^0$ added. The
embedding $i : \mathbb{M} \hookrightarrow E$ is conformal, since from (10.16) and (10.43) we find the relation

\[ i^*\hat g = 4\Omega^2\eta, \tag{10.45} \]

where $\Omega : \mathbb{M} \to \mathbb{R}^+$ is given by (10.20); the factor 4 arises because $-dT^2 + d\chi^2 = -4\,dp\,dq$. In conclusion,
up to this constant factor in the metric, $(\hat{\mathbb{M}}, \hat\eta)$ is precisely the conformal
completion of $(\mathbb{M}, \eta)$ studied above, now embedded in the larger Lorentzian manifold $(E, \hat g)$.


493 The next remark follows from the fact that $\Gamma^q_{qq} = 0$ for the metric (10.24), so that $d^2q/ds^2 = 0$.

494 The general solution of $d^2q/ds^2 = -2(\tan q)\cdot(dq/ds)^2$ is $q(s) = \arctan(c_1(c_2+s))$, for constants $c_1, c_2$.

495 Historically, this space-time arose as Einstein’s static universe, which is a solution to the Einstein equations
with cosmological constant, which indeed Einstein (1917b) introduced precisely to make the universe static.
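The completeness of the generators of $\mathscr{I}^+$ after the rescaling (10.39) can be probed numerically. The sketch below assumes that the $(p,q)$ block of the rescaled metric is $-4\,dp\,dq/\sin^2(p-q) = 2F\,dp\,dq$ with $F = -2/\sin^2(p-q)$, for which $\Gamma^q_{qq} = \partial_q \log|F|$; this Christoffel symbol is obtained by a central difference and the geodesic equation $\ddot q = -\Gamma^q_{qq}\,\dot q^2$ on $p = \tfrac12\pi$ is integrated with a hand-written fourth-order Runge–Kutta step. Step counts and names are illustrative choices.

```python
import math

HALF_PI = math.pi / 2

def log_abs_F(p, q):
    """log|F| for the (p, q) block 2 F dp dq, with F = -2 / sin^2(p - q)."""
    return math.log(2.0) - 2.0 * math.log(math.sin(p - q))

def gamma_qqq(p, q, h=1e-6):
    """Gamma^q_{qq} = d/dq log|F|, approximated by a central difference."""
    return (log_abs_F(p, q + h) - log_abs_F(p, q - h)) / (2.0 * h)

def integrate_generator(s_end, steps=20000):
    """Integrate q'' = -Gamma^q_{qq} (q')^2 on p = pi/2 from q(0) = 0, q'(0) = 1 up to s_end."""
    def rhs(q, w):
        return w, -gamma_qqq(HALF_PI, q) * w * w

    q, w = 0.0, 1.0
    ds = s_end / steps
    for _ in range(steps):
        k1 = rhs(q, w)
        k2 = rhs(q + 0.5 * ds * k1[0], w + 0.5 * ds * k1[1])
        k3 = rhs(q + 0.5 * ds * k2[0], w + 0.5 * ds * k2[1])
        k4 = rhs(q + ds * k3[0], w + ds * k3[1])
        q += ds * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        w += ds * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return q

for s in (1.0, 10.0, -10.0):
    print(s, integrate_generator(s), math.atan(s))
```

Under these assumptions the numerical solution tracks $q(s) = \arctan s$, which stays inside $(-\tfrac12\pi, \tfrac12\pi)$ for every finite $s$; equivalently, $u = \tan q$ is an affine parameter along the generators.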
Null infinity $\mathscr{I}$ for Minkowski space-time $(\mathbb{M},\eta)$ is null, as the name suggests. However,
this is not a consequence of the definition: $\mathscr{I}$ can equally well be spacelike or timelike. These
possibilities are realized, for example, in the other two Lorentzian manifolds of constant positive
and negative curvature, viz. de Sitter space and anti-de Sitter space, respectively (see §4.4). In
this light, $(\mathbb{M}, \eta)$ has constant curvature and cosmological constant both equal to zero.⁴⁹⁶

We start with de Sitter space $dS_4^\rho$, defined by (4.92) with parameter $\rho$; it satisfies the Einstein
equations $R_{\mu\nu} = \Lambda g_{\mu\nu}$ with cosmological constant $\Lambda = 3/\rho^2 > 0$, as follows from (4.85) with
$k = 1/\rho^2$. For simplicity we set $\rho = 1$ and coordinatize $dS_4^1 \cong \mathbb{R} \times S^3$ using $(\tau,\chi,\theta,\varphi)$, where
$\tau \in \mathbb{R}$ and $(\chi,\theta,\varphi) \in [0,\pi] \times [0,\pi] \times [0,2\pi)$ cover the $S^3$ part. Specifically, we have

\[ x_0 = \sinh\tau; \qquad x_1 = \cosh\tau\cos\chi; \qquad x_2 = \cosh\tau\sin\chi\cos\theta; \tag{10.46} \]
\[ x_3 = \cosh\tau\sin\chi\sin\theta\cos\varphi; \qquad x_4 = \cosh\tau\sin\chi\sin\theta\sin\varphi; \tag{10.47} \]
\[ g_+ = -d\tau^2 + \cosh^2\tau\cdot g_{S^3}; \qquad g_{S^3} = d\chi^2 + \sin^2\chi\cdot(d\theta^2 + \sin^2\theta\, d\varphi^2). \tag{10.48} \]

We then compactify $\mathbb{R}$ by switching to $\eta = 2\arctan(\exp\tau) \in (0,\pi)$, which satisfies $\sin\eta = 1/\cosh\tau$, so that

\[ g_+ = (\sin^{-2}\eta)\cdot(-d\eta^2 + g_{S^3}). \tag{10.49} \]

The conformal factor $\Omega(\eta) = \sin\eta$ then turns $g_+$ into $\hat g_+ = \Omega^2 g_+$ already given by (10.43). We
see that also $dS_4^1$ can be conformally embedded into the Einstein universe (10.41), which was
in fact how it was discovered. In the absence of spacelike or timelike infinity, a conformal
completion of $(dS_4^1, g_+)$ is given by the closure of this image, which simply extends the range of
$\eta$ to $[0,\pi]$. The boundary value $\eta = 0$ gives past null infinity $\mathscr{I}^-$, whereas $\eta = \pi$ yields $\mathscr{I}^+$.
For anti-de Sitter space $AdS_4^\rho$, cf. (4.93), we have $\Lambda = -3/\rho^2 < 0$, in which we again take
$\rho = 1$. We use coordinates $\tau \in \mathbb{R}$, $r \geq 0$ or $\chi = \arctan(\sinh r) \in [0,\tfrac12\pi)$, and $(\theta,\varphi) \in S^2$, with

\[ x_1 = \sinh r\cos\theta; \qquad x_2 = \sinh r\sin\theta\cos\varphi; \qquad x_3 = \sinh r\sin\theta\sin\varphi; \tag{10.50} \]
\[ x_{-1} = \cosh r\cos\tau; \qquad x_0 = \cosh r\sin\tau; \tag{10.51} \]
\[ g_- = -\cosh^2 r\, d\tau^2 + dr^2 + \sinh^2 r\cdot g_{S^2} = (\cos^{-2}\chi)\cdot(-d\tau^2 + g_{S^3}), \tag{10.52} \]

so that $\Omega(\chi) = \cos\chi$ also conformally embeds $(AdS_4^1, g_-)$ into the Einstein universe. This time,
null infinity is connected and timelike, corresponding to $\chi = \tfrac12\pi$, i.e. $r = \infty$. See also §5.10.
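The parametrizations (10.46) - (10.47) and (10.50) - (10.51) and the two compactifying substitutions can be sanity-checked numerically. The sketch below assumes that (4.92) and (4.93), which are not reproduced here, are the standard unit hyperboloids $-x_0^2 + x_1^2 + \dots + x_4^2 = 1$ and $-x_{-1}^2 - x_0^2 + x_1^2 + x_2^2 + x_3^2 = -1$; function names are illustrative.

```python
import math

def de_sitter_point(tau, chi, theta, phi):
    """Embedding coordinates (x0, ..., x4) of (10.46)-(10.47)."""
    c, s = math.cosh(tau), math.sinh(tau)
    return (s,
            c * math.cos(chi),
            c * math.sin(chi) * math.cos(theta),
            c * math.sin(chi) * math.sin(theta) * math.cos(phi),
            c * math.sin(chi) * math.sin(theta) * math.sin(phi))

def de_sitter_constraint(x):
    """-x0^2 + x1^2 + ... + x4^2, assumed to equal 1 on de Sitter space."""
    return -x[0] ** 2 + sum(xi * xi for xi in x[1:])

def anti_de_sitter_point(tau, r, theta, phi):
    """Embedding coordinates (x_{-1}, x0, x1, x2, x3) of (10.50)-(10.51)."""
    ch, sh = math.cosh(r), math.sinh(r)
    return (ch * math.cos(tau), ch * math.sin(tau),
            sh * math.cos(theta),
            sh * math.sin(theta) * math.cos(phi),
            sh * math.sin(theta) * math.sin(phi))

def anti_de_sitter_constraint(x):
    """-x_{-1}^2 - x0^2 + x1^2 + x2^2 + x3^2, assumed to equal -1 on anti-de Sitter space."""
    return -x[0] ** 2 - x[1] ** 2 + sum(xi * xi for xi in x[2:])

def de_sitter_conformal_time(tau):
    """eta = 2 arctan(exp(tau)), so that sin(eta) = 1 / cosh(tau)."""
    return 2.0 * math.atan(math.exp(tau))

def anti_de_sitter_conformal_radius(r):
    """chi = arctan(sinh r), so that cos(chi) = 1 / cosh(r)."""
    return math.atan(math.sinh(r))

print(de_sitter_constraint(de_sitter_point(0.7, 1.0, 0.5, 2.0)))
print(anti_de_sitter_constraint(anti_de_sitter_point(0.7, 1.3, 0.5, 2.0)))
print(math.sin(de_sitter_conformal_time(1.5)), 1.0 / math.cosh(1.5))
print(math.cos(anti_de_sitter_conformal_radius(1.3)), 1.0 / math.cosh(1.3))
```

The two identities $\sin\eta = 1/\cosh\tau$ and $\cos\chi = 1/\cosh r$ are what turn $\cosh^2\tau$ and $\cosh^2 r$ into the conformal factors $\sin^{-2}\eta$ and $\cos^{-2}\chi$ in (10.49) and (10.52).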




Penrose diagrams for de Sitter space (left), where null infinity $\mathscr{I} = \mathscr{I}^+ \cup \mathscr{I}^-$ is spacelike and disconnected
(and $\chi = 0, \pi$ are mere coordinate singularities), and anti-de Sitter space (right), where null
infinity $\mathscr{I}$ at $\chi = \tfrac12\pi$ is timelike and connected (and $\chi = 0$ is a coordinate singularity; the vertical
timelike direction has not been compactified and goes on forever).


496 See also Griffiths & Podolský (2009), §§4.2, 5.2, and Valiente Kroon (2016), §§6.3, 6.4 for further details.
10.3 Asymptotic flatness at null infinity and black holes


The example of Minkowski space-time, as well as the black hole space-times reviewed below,
suggests the following sharpening of Definition 10.1, which captures both kinds of examples.⁴⁹⁷


Definition 10.2 A space-time $(M,g)$ is asymptotically flat at null infinity if it has a conformal
completion $(\hat M, \hat g)$ with the following additional properties:⁴⁹⁸


1. $\mathscr{I}^+ \cong \mathscr{I}^- \cong \mathbb{R} \times S^2$ diffeomorphically, cf. (10.37).
2. The Ricci tensor $R_{\mu\nu}$ of the original metric $g$ is such that $R_{\mu\nu} = O(\Omega^3)$ towards $\partial\hat M$.


3. The lightlike geodesics ruling $\mathscr{I}^\pm$ (which by the previous clause is a null hypersurface in
$\hat M$) are complete, provided the conformal factor has been chosen such that on $\mathscr{I}^\pm$ one has

\[ \hat\Delta\Omega = 0. \tag{10.53} \]


In clause 2 and in what follows we tacitly identify $M$ with $i(M)$. Asking $O(\Omega^3)$ is on the safe
side (one could use $O(\Omega^{2+\varepsilon})$ for $1/2 < \varepsilon < 1$), and implies that $\Omega^{-2}R_{\mu\nu}$ extends by continuity
from $i(M)$ to zero on $\partial\hat M$, as in $R_{\mu\nu}(r) \sim 1/r^3$ as $r \to \infty$. The simplest way to satisfy this is to
assume that $(M,g)$ solves the vacuum Einstein equations $R_{\mu\nu} = 0$; in the presence of matter one
equivalently asks that $T_{\mu\nu}$ is $O(\Omega^3)$. The third clause, in which $\hat\Delta := \hat g^{\mu\nu}\hat\nabla_\mu\hat\nabla_\nu$, makes sense
because of a crucial fact, noted (mutatis mutandis) without proof in Penrose (1964, 1968):⁴⁹⁹


Proposition 10.3 On the boundary OM the scaling function Q satisfies the eikonal equation 
8(VO, VO) =0, (10.54) 
so that OM (more precisely: each connected component thereof) is a null hypersurface in M. 


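Proof. A simple computation yields the effect of a conformal rescaling on the Ricci tensor:⁵⁰⁰ 

$$R_{\mu\nu} = \hat{R}_{\mu\nu} + \Omega^{-1}\big(2\hat{\nabla}_\mu\hat{\nabla}_\nu\Omega + \hat{g}_{\mu\nu}\hat{\Delta}\Omega\big) - 3\Omega^{-2}\hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega)\,\hat{g}_{\mu\nu} \tag{10.55}$$
$$\Rightarrow\quad \hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega) = \tfrac{1}{12}\big(\Omega^2\hat{R} - R\big) + \tfrac{1}{2}\Omega\hat{\Delta}\Omega. \tag{10.56}$$

Now $R_{\mu\nu} = O(\Omega^3)$ gives $R = O(\Omega^5)$. Eq. (10.54) follows, as $\hat{g}$ is smooth and $\Omega = 0$ on $\partial\hat{M}$. 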
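
The step from (10.55) to (10.56) is a trace. As a sketch, assuming the four-dimensional convention $\hat{g} = \Omega^2 g$ used in this section (so that $\hat{g}^{\mu\nu} = \Omega^{-2}g^{\mu\nu}$ and $\hat{g}^{\mu\nu}\hat{g}_{\mu\nu} = 4$), contracting (10.55) with $\hat{g}^{\mu\nu}$ gives:

```latex
\begin{align*}
\hat{g}^{\mu\nu}R_{\mu\nu} &= \Omega^{-2}R, \\
\Omega^{-2}R &= \hat{R} + \Omega^{-1}\bigl(2\hat{\Delta}\Omega + 4\hat{\Delta}\Omega\bigr)
               - 12\,\Omega^{-2}\,\hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega), \\
\hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega) &= \tfrac{1}{12}\bigl(\Omega^{2}\hat{R} - R\bigr)
               + \tfrac{1}{2}\,\Omega\,\hat{\Delta}\Omega .
\end{align*}
```

At $\Omega = 0$ the right-hand side vanishes as soon as $\hat{R}$ and $\hat{\Delta}\Omega$ remain bounded and $R \to 0$, which is where clause 2 of Definition 10.2 enters.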


497 Penrose himself already realized that definitions of this kind, which combine smoothness of the boundary $\partial\hat{M}$ 
with specific conditions at infinity, imply detailed fall-off (or ‘peeling’) properties of the Weyl tensor at infinity, 
which may not always hold. See e.g. Klainerman & Nicolò (2003), Friedrich (2004, 2018), Adamo, Newman, & 
Kozameh (2012), and Dafermos (2012). For the usual black hole solutions (i.e. Schwarzschild, Reissner-Nordström 
and Kerr) the boundary is smooth, and this is true much more generally, e.g. for stationary space-times satisfying 
standard energy conditions (Chrusciel et al., 2001). It holds even generically in a suitable topological sense 
(Chrusciel & Delay, 2002; Corvino, 2007; Paetz, 2014; Chrusciel & Paetz, 2015), so we will not worry about this. 

498 Clause 3 is due to Geroch & Horowitz (1978). See also Horowitz (1979) and Wald (1984), §11.1. 

499 If $R_{\mu\nu} = \lambda g_{\mu\nu}$, then $\hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega) = -\tfrac{1}{3}\lambda$ on $\partial\hat{M}$, so that $\hat{\nabla}\Omega$ is timelike and hence $\partial\hat{M}$ is spacelike if $\lambda > 0$, 
and vice versa if $\lambda < 0$ (Penrose, 1964, Lecture II; Penrose, 1968, p. 181). See Ashtekar, Bonga, & Kesavan (2015) 
and Ashtekar & Magnon (1984), respectively, for theory, and §10.2 above for the two simplest examples. But as the 
indisputable king of null geometry in GR, Penrose must have taken special pleasure in the case $\lambda = 0$! 

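500 It is easily verified by direct computation (see e.g. Valiente Kroon, 2016, §5.2.2; Chruściel, 2020, Appendix 
H.6) that if $g' = \varphi^2 g$, then $R'_{\mu\nu} = R_{\mu\nu} - \varphi^{-1}(2\nabla_\mu\nabla_\nu\varphi + g_{\mu\nu}\Delta_g\varphi) + \varphi^{-2}(4\nabla_\mu\varphi\nabla_\nu\varphi - g_{\mu\nu}g(\nabla\varphi,\nabla\varphi))$. Now 
replace $g' \rightsquigarrow g$ and $g \rightsquigarrow \hat{g}$, so that $\varphi = 1/\Omega$. This gives (10.55), which is eq. (11.1.16) in Wald (1984). 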
Without clause 3, the definition (10.78) below of a black hole would be flawed (see footnote 506). 


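The need for a condition on $\Omega$ in order to state completeness of null infinity has been explained 
in the previous section; for otherwise even $(\mathbb{M},\eta)$ would be a counterexample. We write 

$$\hat{N} := \hat{\nabla}\Omega; \qquad \hat{N}_\mu = \partial_\mu\Omega; \qquad \hat{N}^\mu = \hat{g}^{\mu\nu}\partial_\nu\Omega, \tag{10.57}$$

so that $\hat{N}^\mu\hat{N}_\mu = 0$ on $\mathscr{I}^\pm$. Let us first note that on $\mathscr{I}^\pm$, i.e. for $\Omega = 0$, eq. (10.55) implies 

$$\hat{g}_{\mu\nu}\hat{\Delta}\Omega = 4\hat{\nabla}_\mu\hat{N}_\nu, \tag{10.58}$$

so that (10.53), or $\hat{\nabla}_\mu\hat{N}^\mu = 0$ on $\mathscr{I}^\pm$, is equivalent to the seemingly stronger condition 

$$\hat{\nabla}_\mu\hat{N}_\nu = 0, \tag{10.59}$$

still on $\mathscr{I}^\pm$ only. This condition, in turn, implies that on $\mathscr{I}^\pm$ we have the geodesic equation 

$$\hat{\nabla}_{\hat{N}}\hat{N} = 0. \tag{10.60}$$

In other words, in the “gauge” (10.53) the flow of the vector field $\hat{N}$, restricted to $\mathscr{I}^\pm$, consists of 
lightlike geodesics, and clause 3 requires that these particular lightlike geodesics be complete.⁵⁰¹ 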
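
The chain from (10.55) via (10.58) to (10.60) can be made explicit. The following is only a sketch under the assumptions already in force ($R_{\mu\nu} = O(\Omega^3)$, with $\hat{g}$ and $\Omega$ smooth up to the boundary), writing $\hat{N}_\mu = \partial_\mu\Omega$ as in (10.57); the intermediate limit is read off from (10.56) after dividing by $\Omega$.

```latex
\begin{align*}
% multiply (10.55) by \Omega and let \Omega \to 0, using \Omega R_{\mu\nu} \to 0:
0 &= 2\hat{\nabla}_\mu\hat{N}_\nu + \hat{g}_{\mu\nu}\hat{\Delta}\Omega
     - 3\,\hat{g}_{\mu\nu}\lim_{\Omega\to 0}\Omega^{-1}\hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega), \\
% divide (10.56) by \Omega; the curvature terms drop out in the limit:
\lim_{\Omega\to 0}\Omega^{-1}\hat{g}(\hat{\nabla}\Omega,\hat{\nabla}\Omega) &= \tfrac{1}{2}\hat{\Delta}\Omega, \\
% combining the two lines gives (10.58):
\hat{g}_{\mu\nu}\hat{\Delta}\Omega &= 4\hat{\nabla}_\mu\hat{N}_\nu, \\
% if \hat{\nabla}_\mu\hat{N}_\nu = 0, contracting with \hat{N}^\mu gives (10.60):
(\hat{\nabla}_{\hat{N}}\hat{N})_\nu &= \hat{N}^\mu\hat{\nabla}_\mu\hat{N}_\nu = 0 .
\end{align*}
```

Taking the trace of (10.58) returns an identity, which is why the single scalar condition (10.53) already forces all components of $\hat{\nabla}_\mu\hat{N}_\nu$ to vanish on $\mathscr{I}^\pm$.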

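Towards showing that (10.53) can be satisfied by a suitable choice of the conformal factor 
$\Omega$, we first relabel $\hat{g}$ as $\tilde{g}$, with ensuing differential operators $\tilde{\nabla}$ and $\tilde{\Delta}$, also relabel the original $\Omega$ 
as $\tilde{\Omega}$, i.e. $\tilde{g} = \tilde{\Omega}^2 g$, with $\tilde{N} = \tilde{\nabla}\tilde{\Omega}$, and define $\Omega = \omega\tilde{\Omega}$, where $\omega : \hat{M} \to (0,\infty)$ is smooth and 
nonzero on $\mathscr{I}^\pm$, for otherwise (10.61) below would make $\hat{N} = 0$, against Definition 10.1. Still 
using the notation (10.57), a straightforward computation shows that on $\mathscr{I}^\pm$ we have 

$$\hat{N}_\mu = \omega\tilde{N}_\mu; \tag{10.61}$$
$$\hat{\nabla}_\mu\hat{N}_\nu = \omega\tilde{\nabla}_\mu\tilde{N}_\nu + \tilde{g}_{\mu\nu}\tilde{N}^\rho\tilde{\nabla}_\rho\omega. \tag{10.62}$$

Eq. (10.61) follows from $\hat{N}_\mu = \partial_\mu\Omega = \partial_\mu(\omega\tilde{\Omega}) = (\partial_\mu\omega)\tilde{\Omega} + \omega\partial_\mu\tilde{\Omega}$, which on $\mathscr{I}^\pm$, where 
$\tilde{\Omega} = 0$, equals $\omega\partial_\mu\tilde{\Omega} = \omega\tilde{N}_\mu$. Eq. (10.62) follows from once (covariantly) differentiating (10.55) 
and (10.56).⁵⁰² On $\mathscr{I}^\pm$, eqs. (10.62) and (10.58), but now for the “tilde” quantities, give 

$$\hat{\nabla}_\mu\hat{N}_\nu = \tfrac{1}{4}\tilde{g}_{\mu\nu}\big(\omega\tilde{\Delta}\tilde{\Omega} + 4\tilde{N}^\rho\partial_\rho\omega\big). \tag{10.63}$$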
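
For readers who want an independent check of (10.62) and (10.63), here is a direct computation, offered as a sketch rather than as the route taken in the text. It uses only $\hat{g} = \omega^2\tilde{g}$ and the standard transformation of Christoffel symbols under a conformal rescaling, evaluated where $\tilde{\Omega} = 0$. The symbols $\hat{\Gamma}$ and $\tilde{\Gamma}$ (Christoffel symbols of $\hat{g}$ and $\tilde{g}$) are notation introduced here for illustration.

```latex
\begin{align*}
\hat{\Gamma}^{\rho}_{\mu\nu} &= \tilde{\Gamma}^{\rho}_{\mu\nu}
  + \omega^{-1}\bigl(\delta^{\rho}_{\mu}\partial_\nu\omega + \delta^{\rho}_{\nu}\partial_\mu\omega
  - \tilde{g}_{\mu\nu}\tilde{g}^{\rho\sigma}\partial_\sigma\omega\bigr), \\
% differentiate \hat{N}_\nu = \tilde{\Omega}\,\partial_\nu\omega + \omega\,\tilde{N}_\nu, then set \tilde{\Omega} = 0:
\partial_\mu\hat{N}_\nu\big|_{\tilde{\Omega}=0} &= \tilde{N}_\mu\partial_\nu\omega
  + \tilde{N}_\nu\partial_\mu\omega + \omega\,\partial_\mu\tilde{N}_\nu, \\
% subtract \hat{\Gamma}^{\rho}_{\mu\nu}\hat{N}_\rho with \hat{N}_\rho = \omega\tilde{N}_\rho; the mixed terms cancel:
\hat{\nabla}_\mu\hat{N}_\nu\big|_{\tilde{\Omega}=0} &= \omega\,\tilde{\nabla}_\mu\tilde{N}_\nu
  + \tilde{g}_{\mu\nu}\tilde{N}^{\rho}\partial_\rho\omega, \\
% insert (10.58) for the tilde quantities:
\tilde{\nabla}_\mu\tilde{N}_\nu = \tfrac{1}{4}\tilde{g}_{\mu\nu}\tilde{\Delta}\tilde{\Omega}
  \quad&\Longrightarrow\quad
  \hat{\nabla}_\mu\hat{N}_\nu = \tfrac{1}{4}\tilde{g}_{\mu\nu}
  \bigl(\omega\tilde{\Delta}\tilde{\Omega} + 4\tilde{N}^{\rho}\partial_\rho\omega\bigr).
\end{align*}
```

The right-hand side vanishes exactly when $\tilde{N}^{\rho}\partial_\rho\omega = -\tfrac{1}{4}\omega\tilde{\Delta}\tilde{\Omega}$, a linear first-order equation for $\omega$ along the integral curves of $\tilde{N}$.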


Since $\tilde{N}^\rho\partial_\rho$ differentiates along $\mathscr{I}^\pm$, one can solve the ODE 

$$\tilde{N}^\rho\partial_\rho\omega = -\tfrac{1}{4}\omega\tilde{\Delta}\tilde{\Omega} \tag{10.64}$$

on $\mathscr{I}^\pm$ for given $\tilde{\Omega}$, with any initial condition $\omega_0$, and this choice of $\omega$ achieves (10.59) and 
hence (10.53). Because of Definition 10.2.1, the initial condition may be stated on some fiducial 
copy of $S^2$, call it $S^2_0$. Furthermore, as a result of the classification of compact Riemann surfaces, 


501 Completeness of curves depends on their parametrization. Geodesics are affinely parametrized by definition 
(and an affine reparametrization does not affect their (in)completeness), but a change in $\Omega$ changes the metric $\hat{g}$ and 
hence the notion of a geodesic with respect to $\hat{g}$ (for given $g$), so that completeness does depend on the choice of $\Omega$. 

502 See e.g. Wald (1984), §11.1, Stewart (1991), §3.6, or Reall (2020), §5.2, for details. The extra derivative in the 
derivation of (10.62) requires better asymptotics than in Definition 10.2.2, such as $R_{\mu\nu} = O(\Omega^4)$ or $o(\Omega^3)$, so we 
assume this here. The following analysis is taken from Wald (1984), pp. 279-280, see also Reall (2020), §5.2. 
in this case of genus zero, any Riemannian metric on $S^2$ is conformal to the standard one $g_{S^2}$. 
We may therefore choose $\omega_0$ such that $\hat{g}_{S^2_0} = g_{S^2}$. We now show that on the identification 

$$\mathscr{I}^\pm \cong \mathbb{R}\times S^2 \tag{10.65}$$

from Definition 10.2.1, this remains true for all copies of $S^2$ in $\mathscr{I}^\pm$. Below we take $\mathscr{I}^+$; the 
other case $\mathscr{I}^-$ involves some sign changes. We first choose coordinates $(u,\theta,\varphi)$ on $\mathscr{I}^+$ such 
that the point $\gamma(u)$ labels the solution $\gamma$ of (10.60) starting at $(\theta,\varphi) \in S^2_0$ for $u = 0$ with $\dot{\gamma}(0) = \hat{N}$ 
(so that $u \in \mathbb{R}$ by Definition 10.2.3). Because of (10.12), we can also use $\Omega$ as a coordinate on 
$\hat{M}$, at least near $\mathscr{I}^+$, so that we have local coordinates $(\Omega,u,\theta,\varphi)$. Note that 
$$\frac{\partial}{\partial u} = \hat{N} = \hat{\nabla}\Omega, \tag{10.66}$$
which is a lightlike vector, is tangent to $\mathscr{I}^+$, whereas the vector field $\partial/\partial\Omega$ points away from it. 
Eq. (10.59), written in terms of the Christoffel symbols, then implies that on $\mathscr{I}^+$, i.e. for 
$\Omega = 0$, the $(\theta,\varphi)$ components of $\hat{g}_{\mu\nu}$ are independent of $u$. Collecting all we know, we obtain 

$$\hat{g}|_{\Omega=0} = 2\,du\,d\Omega + g_{S^2}. \tag{10.67}$$

If we then introduce a (by definition) radial coordinate $v := 2/\Omega$, the physical metric near $\mathscr{I}^+$ 
is 

$$g = -du\,dv + \tfrac{1}{4}(v-u)^2 g_{S^2} + \cdots \tag{10.68}$$
as $v \to \infty$ at fixed $u$, where compared with (10.67) we have written $(v-u)^2$ instead of $v^2$. This 
is because, using (10.59), as $v \to \infty$ the remainder terms denoted by $\cdots$ can be shown to be: 

$$O(v) \text{ in } d\theta^2, d\varphi^2, d\theta\,d\varphi; \qquad O(1) \text{ in } du^2, du\,d\theta, du\,d\varphi;$$
$$O(1/v) \text{ in } dv\,du, dv\,d\theta, dv\,d\varphi; \qquad O(1/v^2) \text{ in } dv^2. \tag{10.69}$$

Hence the leading terms of $g$ near $\mathscr{I}^+$ are the same as in Minkowski space, cf. (10.16). 
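
The leading term of the physical metric can be checked by hand. The lines below are a sketch that assumes only (10.67), the relation $g = \Omega^{-2}\hat{g}$, and the definition $v := 2/\Omega$; it captures the leading order at $\Omega = 0$ only, and the last line shows that replacing $v^2$ by $(v-u)^2$ changes the angular part by a term of order $v$, which is of the size allowed by the remainder (10.69).

```latex
\begin{align*}
\Omega = \frac{2}{v} \quad&\Longrightarrow\quad d\Omega = -\frac{2}{v^{2}}\,dv, \\
g = \Omega^{-2}\hat{g} &\approx \frac{v^{2}}{4}\Bigl(2\,du\,d\Omega + g_{S^2}\Bigr)
   = -\,du\,dv + \tfrac{1}{4}v^{2}\,g_{S^2}, \\
\tfrac{1}{4}(v-u)^{2} - \tfrac{1}{4}v^{2} &= -\tfrac{1}{2}uv + \tfrac{1}{4}u^{2} = O(v)
   \qquad (v\to\infty,\ u \text{ fixed}).
\end{align*}
```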

This completes the exegesis of Definition 10.2. Having used Minkowski space-time to 
motivate this definition, let us use the Schwarzschild solution to check that it is reasonable. 
To find a conformal completion of the Schwarzschild space-time (9.46) with metric (9.45) in 
outgoing Eddington-Finkelstein coordinates $(u,r,\theta,\varphi)$, first change $r$ to $w = 1/r$, which gives 

$$g = \frac{1}{w^2}\big(2\,du\,dw - w^2(1-2mw)\,du^2 + d\theta^2 + \sin^2\theta\,d\varphi^2\big). \tag{10.70}$$
Then take $\Omega(u,w,\theta,\varphi) = w$, which obviously gives the unphysical metric 
$$\hat{g} = 2\,du\,dw - w^2(1-2mw)\,du^2 + d\theta^2 + \sin^2\theta\,d\varphi^2, \tag{10.71}$$

defined on a manifold with boundary $\hat{M}_S$ given by adding all points with $w = 0$. The map 
$\iota : M_S \to \hat{M}_S$ is then the identity, much as in the conformal compactification of the Poincaré 
disc $\mathbb{D}$ reviewed in §10.1. Since $r \to \infty$ at fixed $u$ amounts to $v \to \infty$, this procedure only adds 
future null infinity $\mathscr{I}^+$. To add past null infinity $\mathscr{I}^-$, one should repeat this procedure for the 
incoming Eddington-Finkelstein coordinates $(v,r,\theta,\varphi)$, and as long as $r > 2m$ these can be 
combined to define a conformal completion of the corresponding part of $M_S$, where the passage 
from outgoing to incoming coordinates is just a coordinate transformation. However, we will not 
spell out the result, since what we really are interested in is the entire region $0 < r < \infty$, where 
the two coordinate systems are no longer related by a coordinate transformation. 
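
As a check on (10.70) and (10.71), one can substitute $r = 1/w$ directly. The sketch below assumes the standard outgoing Eddington-Finkelstein form of the Schwarzschild metric, $g = -(1-2m/r)\,du^2 - 2\,du\,dr + r^2 g_{S^2}$, which is presumably what (9.45) refers to.

```latex
\begin{align*}
r = \frac{1}{w} \quad&\Longrightarrow\quad dr = -\frac{1}{w^{2}}\,dw, \\
g &= -\Bigl(1-\frac{2m}{r}\Bigr)du^{2} - 2\,du\,dr
     + r^{2}\bigl(d\theta^{2}+\sin^{2}\theta\,d\varphi^{2}\bigr) \\
  &= \frac{1}{w^{2}}\Bigl(2\,du\,dw - w^{2}(1-2mw)\,du^{2}
     + d\theta^{2}+\sin^{2}\theta\,d\varphi^{2}\Bigr), \\
\hat{g} = w^{2}g &= 2\,du\,dw - w^{2}(1-2mw)\,du^{2}
     + d\theta^{2}+\sin^{2}\theta\,d\varphi^{2}.
\end{align*}
```

At $w = 0$ this reduces to $2\,du\,dw + g_{S^2}$, which has the form (10.67) with $\Omega = w$.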
To define a conformal completion of all of $M_S$, we may enlarge it to the Kruskal space-time 
$M_K$, described in $(U,V)$ coordinates. In a subtle variation on (10.21) - (10.22), we define 

$$V = \sinh(\tan P); \qquad P = \arctan(\operatorname{arcsinh} V); \tag{10.72}$$
$$U = \sinh(\tan Q); \qquad Q = \arctan(\operatorname{arcsinh} U); \tag{10.73}$$
$$P, Q \in (-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi); \qquad P + Q \in (-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi), \tag{10.74}$$

where the last condition is necessary to keep $r > 0$. The conformal completion then has $P, Q \in 
[-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi]$, still subject to the second part of (10.74); for the upper $r = 0$ branch in the Kruskal 
diagram (which is part of neither $M_K$ nor $\hat{M}_K$) corresponds to $P + Q = \tfrac{1}{2}\pi$ in the Penrose diagram, 
whereas the lower $r = 0$ branch is $P + Q = -\tfrac{1}{2}\pi$. This gives 


$$g = -\frac{32m^3 e^{-r/2m}}{r}\,\frac{\cosh(\tan P)\cosh(\tan Q)}{\cos^2 P\cos^2 Q}\,dP\,dQ + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2), \tag{10.75}$$

where $r$ is regarded as a function of $P$ and $Q$ similar to the explanation following (9.55), now 
adding (10.72) - (10.73) to the story. For the conformal factor we now obviously take 

$$\Omega(P,Q)^2 = \frac{r e^{r/2m}}{32m^3}\,\frac{\cos^2 P\cos^2 Q}{\cosh(\tan P)\cosh(\tan Q)}, \tag{10.76}$$

which has the right asymptotics $\Omega(r) \sim 1/r$ as $r \to \infty$.⁵⁰³ The rescaled metric then becomes 

$$\hat{g} = -dP\,dQ + r^2\,\Omega(P,Q)^2\,(d\theta^2 + \sin^2\theta\,d\varphi^2), \tag{10.77}$$

which shows that the two-spheres $S^2$ in $\mathscr{I}^\pm$ acquire their usual metric $g_{S^2}$. A simple computation 
shows that Definition 10.2.3 is satisfied. The other two clauses are obvious from (9.46), which 
also applies to $M_K$, and from the fact that the Ricci tensor of $g$ vanishes. 


[Figure: Penrose diagram for Kruskal space-time; the upper and lower boundaries are labelled $r = 0$.] 

Penrose diagram for Kruskal space-time $M_K$, in which Schwarzschild space-time $M_S$ corresponds to 
regions I and II (excluding the SE-NW green diagonal but including the upper half of the green SW-NE 
line), see §9.4. The $P$-axis is at 45° and the $Q$-axis is at 135°, just like $V$ and $U$ in the Kruskal diagram, 
or $(p,q)$ in the Minkowski case. Hence radial lightlike geodesics move parallel to these axes. The green 
lines are event horizons, whilst the blue line is a Cauchy surface. 


503 To see this, use (9.56), which for $r \to \infty$ gives $\Omega^2 \sim \tanh(\tan P)\tanh(\tan Q)\cos^2 P\cos^2 Q$. Towards e.g. $\mathscr{I}^+$, 
where $P \to \tfrac{1}{2}\pi$ at fixed $Q$, this gives $\Omega \sim \cos P$. In the same regime, $r \sim \ln V \sim \ln(\sinh(\tan P)) \sim \tan P \sim 1/\cos P$. 
• Future null infinity $\mathscr{I}^+$ has two components: on the right it has $P = \tfrac{1}{2}\pi$, $Q \in (-\tfrac{1}{2}\pi, 0)$, times 
the two-sphere $S^2$. For the Schwarzschild space-time $M_S$ this is all. For Kruskal space-time, $\mathscr{I}^+$ in 
addition has the component on the left, where $P \in (-\tfrac{1}{2}\pi, 0)$, $Q = \tfrac{1}{2}\pi$. 

• Past null infinity $\mathscr{I}^-$ similarly has two components: on the right it has $P \in (0, \tfrac{1}{2}\pi)$, $Q = -\tfrac{1}{2}\pi$ 
(which is all for $M_S$), and on the left, $P = -\tfrac{1}{2}\pi$, $Q \in (0, \tfrac{1}{2}\pi)$. 

• Future timelike infinity $i^+$ is $(P = \tfrac{1}{2}\pi, Q = 0)$ on the right and $(0, \tfrac{1}{2}\pi)$ on the left. 

• Past timelike infinity $i^-$ is $(P = 0, Q = -\tfrac{1}{2}\pi)$ on the right and $(-\tfrac{1}{2}\pi, 0)$ on the left. 

• Spacelike infinity $i^0$ is $(\tfrac{1}{2}\pi, -\tfrac{1}{2}\pi)$ on the right and $(-\tfrac{1}{2}\pi, \tfrac{1}{2}\pi)$ on the left. 


We return to the general theory. The following definition is due to Hawking and Penrose.⁵⁰⁵ 

Definition 10.4 Let $(M,g)$ be a space-time that is asymptotically flat at null infinity. The black 
hole region $B^+$ and the white hole region $B^-$ in $M$ are defined (and then re-expressed) by 

$$B^\pm := M\setminus\big(J^\mp(\mathscr{I}^\pm)\cap M\big) = M\setminus\big(I^\mp(\mathscr{I}^\pm)\cap M\big) = M\setminus I^\mp(\mathscr{I}^\pm). \tag{10.78}$$
Each connected component of $B^\pm$, if not empty, is a black hole (or white hole). The boundaries 
$$H_E^\pm := \partial B^\pm = \partial\big(J^\mp(\mathscr{I}^\pm)\cap M\big) = \partial I^\mp(\mathscr{I}^\pm) \tag{10.79}$$

decompose into the future and past event horizons of each of the black and white holes in $M$. 


The hole regions $B^\pm$ are closed, so that $H_E^\pm \subset B^\pm$, i.e. the event horizons form part of the holes. 
For Minkowski space-time $(\mathbb{M},\eta)$ with conformal completion $(\hat{\mathbb{M}},\hat{\eta})$, as in (10.24) - (10.25), 

$$J^-(\mathscr{I}^+)\cap\mathbb{M} = J^+(\mathscr{I}^-)\cap\mathbb{M} = \mathbb{M}; \qquad B^\pm = \emptyset. \tag{10.80}$$

Thus Minkowski space-time is free of holes.⁵⁰⁶ For Kruskal space-time $(M_K, g_K)$, the Penrose 
diagram shows that $J^-(\mathscr{I}^+)\cap M_K$ consists of regions I, III, and IV (without boundaries), whilst 
$J^-(\mathscr{I}^+)\cap M_S$ is just region I. In both cases the black hole $B^+$ is in region II (with boundaries). 


504 Eqs. (10.72) - (10.73) are suggested by Penrose (1968), p. 209. The choice $V = \tan P$, $U = \tan Q$, as in Valiente 
Kroon (2016), p. 165, will not work here; his metric (which in our notation would be $\hat{g}$) vanishes on $\mathscr{I}^\pm$. 

505 See Penrose (1969) and Hawking (1972), as further analyzed in footnote 485. The non-defining equalities in 
(10.78) and (10.79) follow from $J^\mp(\mathscr{I}^\pm)\cap M = I^\mp(\mathscr{I}^\pm)\cap M = I^\mp(\mathscr{I}^\pm)$, where the last equality follows because 
$\mathscr{I}^\pm$ is null, cf. Proposition 10.3 and Lemma 4.16. To prove the first equality, we take $x \in J^-(y)$ for some $x \in M$ 
and $y \in \mathscr{I}^+$. Then we are done if $x \in I^-(y)$, so assume $x \in J^-(y)\setminus I^-(y)$, in which case $x$ and $y$ must be connected 
by a lightlike (pre)geodesic, cf. Corollary 5.14.1. By Propositions 10.3 and 6.9 one may take any point $z \in \mathscr{I}^+$ on a 
fd lightlike geodesic through $y$ within $\mathscr{I}^+$, which can clearly be connected to $x$ by a path that is not a lightlike 
(pre)geodesic. Hence $x \in I^-(z)$ again by Corollary 5.14.1. See also Wald (1984), p. 308. Finally, if $Y \subset X$, then $\partial Y$ 
consists of all $x \in X$ for which any nbhd $U$ intersects both $Y$ and $X\setminus Y$. Hence $\partial Y = \partial(X\setminus Y)$. 

506 On the other hand, truncating $\mathscr{I}^+$ to for example $\{(p,q,\theta,\varphi) \mid p = \pi/2, q \in (-\pi/2, 0)\}$ instead of (10.26), 
would turn $J^-(\mathscr{I}^+)\cap\mathbb{M}$ into the region $u < 0$ and hence make the future lightcone $J^+(0)$ a fake black hole in 
$\mathbb{M}$. This would still satisfy Definition 10.1, but Definition 10.2.3 would now fail. Removing $B^+ = J^+(0)$, the 
space-time $(M,g) := (\mathbb{M}\setminus J^+(0), \eta)$ still has a conformal completion (for example the one just described), and is 
now free of black holes, but its future null infinity is incomplete. Excluding cases like this was in fact what led 
Geroch & Horowitz (1978) to introduce clause 3 in Definition 10.2 (in slightly different form). The inextendibility 
condition proposed by Geroch (1977) fails to exclude cases like $(\mathbb{M}\setminus J^+(0), \eta)$, but his inextendibility plus some 
regularity condition did enable him to prove uniqueness of conformal completions, a result that seems to have no 
analogue for Definition 10.2. See also Chruściel (2002), §3.4 for further comments on this issue. 
The future event horizon $H_E^+$ consists of the two upper $r = 2m$ lines. Similarly, $J^+(\mathscr{I}^-)\cap M_K$ 
consists of regions I, II, and III (without boundaries), and $J^+(\mathscr{I}^-)\cap M_S$ is regions I and II, i.e., 
all of $M_S$. The white hole $B^-$ in $M_K$ is region IV (with boundaries), with past event horizon $H_E^-$ 
consisting of the two lower green $r = 2m$ lines. See also the Penrose diagram below (10.77). 

To close this section, we discuss the fact that though mathematically sweet, Definition (10.79) 
of a horizon, based on the idealizations in Definition 10.2, is not entirely uncontroversial: 


This definition depends on the whole future behaviour of the solution (...) one cannot 
find where the event horizon is without solving the Cauchy problem for the whole future 
development of the [partial Cauchy] surface. (Hawking & Ellis, 1973, p. 319) 


This definition is global in a strong and straightforward sense: the idea that nothing can 
escape the interior of a black hole once it enters makes implicit reference to all future time – 
the thing can never escape no matter how long it tries.⁵⁰⁷ Thus, in order to know the location 
of the event horizon in spacetime, one must know the entire structure of the spacetime, 
from start to finish, so to speak, and all the way out to infinity. As a consequence, no local 
measurements one can make can ever determine the location of an event horizon. (...) 
Another disturbing property of the event horizon, arising from its global nature, is that it is 
prescient. Where I locate the horizon today depends on what I throw in it tomorrow – which 
future-directed possible paths of particles and light rays can escape to infinity starting today 
depends on where the horizon will be tomorrow, and so that information must already be 
accounted for today. Physicists find this feature even more troubling. (Curiel, 2019b, p. 29) 


[The notion of a horizon] is probably very useless, because it assumes we can compute the 
future of real black holes, and we cannot. (Carlo Rovelli, quoted in Curiel, 2019b, p. 30) 


I have no idea why there should be any controversy of any kind about the definition of a 
black hole. There is a precise, clear definition in the context of asymptotically flat spacetimes 
(...) I don’t see this as any different than what occurs everywhere else in physics, where 
one can give precise definitions for idealized cases but these are not achievable/measurable 
in the real world. (Bob Wald, quoted in Curiel, 2019b, p. 32) 


However, the disagreement may not be so bad, since two kinds of idealizations are involved 
here: (i) The ability to know an entire space-time $(M,g)$, either from initial data or by direct 
construction, and (ii) The construction of (null) infinity from which the black hole and its event 
horizon are defined. Rovelli’s comment seems to apply to the first point and Wald’s to the second. 
On the other hand, the second point is predicated on the first, which remains unresolved, except 
from the point of view of a Laplacian demon. Definition (10.79) is an axiomatic approach to 
black holes, subject to Russell’s famous charge that ‘The method of “postulating” what we want 
has many advantages; they are the same as the advantages of theft over honest toil.’ However, 
nothing is wrong with an axiomatic approach as long as one can find representative and realistic 
models for the axioms (or definitions) that show that they are reasonable. This is certainly the 
case here, where the known exact black hole solutions validate all definitions. 

In any case, discussions like this have led to alternative, more local definitions of event 
horizons, of which apparent, dynamical, isolated, and naive horizons are examples.⁵⁰⁸ For 
stationary black holes one may even avoid $\mathscr{I}$ at no cost in defining event horizons, see §10.9. 


507 Or, as Ashtekar & Galloway (2005), p. 2, insightfully write in an article on dynamical horizons: ‘[$H_E^+$] is the 
boundary of an interior spacetime region from which causal signals can never be sent to the asymptotic observers, 
no matter how long they are prepared to wait. The region is therefore “black” in an absolute sense.’ 

508 See e.g. Hawking & Ellis (1973), §9.2, Ashtekar & Krishnan (2004), Booth (2005), Chrusciel (2002; 2020, 
§8.4), Hayward (2013), and Faraoni (2015). See also §10.11 for a short introduction to apparent horizons. 
10.4 Cosmic censorship à la Penrose 


A key issue in the theory of black holes is Penrose’s cosmic censorship conjecture, which he 
first raised in 1969 and which underwent several refinements, bifurcations, and reformulations 
since then.⁵⁰⁹ One way to understand this development is to compare the actual achievement of 
Penrose’s 1965 incompleteness theorem with its intended goal. Quoting Penrose’s 2020 Physics 
Nobel Prize citation, this goal was to prove that ‘black hole formation is a robust prediction of 
the general theory of relativity’ (see also chapter 6). However, what the theorem proved was that 
the conjunction of (i) a non-compact Cauchy surface; (ii) the null energy condition, and (iii) the 
presence of a trapped surface, implies lightlike geodesic incompleteness (cf. Theorem 6.15). In 
the light of the analysis in the preamble to chapter 6, it is clear that two things were missing: 


1. To get closer to a curvature singularity as the source of lightlike geodesic incompleteness, 
one should get rid of the possibility of extendibility of the space-time in question. 


2. Although an event horizon is what makes black holes black, this concept plays no role 
whatsoever in the theorem, and hence it should be involved one way or the other.⁵¹⁰ 


Briefly, the first point leads to strong cosmic censorship, whereas the second leads to weak 
cosmic censorship.⁵¹¹ The latter came first. Here is Penrose’s original formulation: 


We are thus presented with what is perhaps the most fundamental question of general- 
relativistic collapse theory, namely: does there exist a “cosmic censor” who forbids the 
appearance of naked singularities, clothing each one in an absolute event horizon? In one 
sense, a “cosmic censor” can be shown not to exist. For it follows from a theorem of 
Hawking that the “big bang” singularity is, in principle, observable. But it is not known 
whether singularities observable from outside will ever arise in a generic collapse which 
starts off from a perfectly reasonable nonsingular initial state. (Penrose, 1969, p. 1162) 


Or, in Penrose (1979), p. 618, with an emphasis on initial data: 


A system which evolves, according to classical general relativity with reasonable equations 
of state, from generic non-singular initial data on a suitable Cauchy hypersurface, does not 
develop any spacetime singularity which is visible from infinity. (Penrose, 1979, p. 618). 


Visibility from infinity, then, is blocked by an event horizon. However, Penrose argued: 


It seems to me to be comparatively unimportant whether the observer himself can escape to 
infinity. Classical general relativity is a scale-invariant theory, so if locally naked singularities 
occur on a very tiny scale, they should also, in principle, occur on a very large scale in 
which a ‘trapped’ observer could have days or even years to ponder upon the implications of 
the uncertainties introduced by the observations of such a singularity. (...) It would seem, 
therefore, that if cosmic censorship is a principle of Nature, it should be formulated in such 
a way as to preclude such locally naked singularities. (Penrose, 1979, p. 619) 


509 See Earman (1995), chapter 2, for a complete survey up to that point, and Joshi (1993, 2007) for case studies. 

510 Yet in one of the most important papers about black holes in observational astronomy (Event Horizon Telescope 
Collaboration, 2019a), which would have deserved to share the 2020 Physics Nobel Prize with Penrose, the authors 
write: ‘A defining feature of black holes is their event horizon, a one-way causal boundary in spacetime from which 
not even light can escape (Schwarzschild 1916). The production of black holes is generic in GR (Penrose 1965).’ 

511 In 1965 Penrose knew the concept of future null infinity $\mathscr{I}^+$, since he had conceived it himself at least a 
year earlier (Penrose, 1964). We now know that this leads to a clean definition of black holes and their (absolute) 
event horizons (see §10.3), but at the time $\mathscr{I}^+$ was apparently supposed to be relevant mainly for the study of 
gravitational radiation. Its application to black holes had to wait until at least Penrose (1969). See footnote 485. 
This preclusion, then, is Penrose’s (original) idea of strong cosmic censorship. Although 
historically the strong version postdated the weak one, it is conceptually easier to start defining 
strong cosmic censorship rigorously, and then move to the weak version as a modification thereof.⁵¹² 
The problem is to define what it means for a signal to emanate from a singularity, since 
the latter is not part of space-time. To resolve this, we recall that in the context of the 
singularity/incompleteness theorems, singularities were tentatively captured by incomplete causal 
geodesics in space-time, and hence one would expect Penrose to use these here, too. However, 
instead of incomplete causal geodesics he now uses inextendible causal curves.⁵¹³ It turns out 
that the change from ‘causal’ to ‘timelike’ does not matter,⁵¹⁴ but the change from geodesics to 
curves, which is required by the proof of Penrose’s Theorem 10.6 below,⁵¹⁵ is substantial.⁵¹⁶ 
Now the reasoning that leads to a definition of strong cosmic censorship is as follows. 


1. If $(M,g)$ is strongly causal,⁵¹⁷ then $I^-(x) = I^-(x')$ iff $x = x'$. This allows us to exchange 
properties of points $x$ for properties of their timelike pasts $I^-(x)$, which will be crucial. 

2. By definition, we have $z \ll x$, i.e. $z \in I^-(x)$, iff some future-directed timelike curve from $z$ 
can reach $x$. In this case $z$ can signal to $x$, or influence $x$, and since $x$ can “see” $z$, we say 
that $z$ is locally naked for $x$. This, then, is equivalent to the property 

$$I^-(z) \subset I^-(x). \tag{10.81}$$

3. If $z$ is the endpoint of some fd timelike curve $c$, then $I^-(z) \subset I^-(x)$ iff $I^-(c) \subset I^-(x)$. 

4. The point is that this also works if $c$ has no endpoint and hence defines a “singularity”: 
this singularity is deemed locally naked for $x$ iff $I^-(c) \subset I^-(x)$. In summary: 


512 The following discussion is based on Penrose (1979), which we simplify by removing the TIPs of Geroch, 
Kronheimer, & Penrose (1972) from the discussion, including the proof of Theorem 10.6. Penrose’s timelike curves 
are smooth (Penrose, 1972, pp. 2-3), whereas we work with continuous causal curves, see Definition 5.20 in §5.6. 

513 (In)completeness of non-geodesic curves depends on the parametrization. If continuous causal curves are 
parametrized by arc length as measured by an auxiliary complete Riemannian metric (see footnote 512), then 
any inextendible curve has infinite arc length, see Lemma 5.22. Also, recall that (affinely parametrized) timelike 
geodesics are incomplete iff they are inextendible and have finite parameter length, cf. Proposition 5.19. 

514 In the light of the analysis below, this follows from Theorem (2.3) in Geroch, Kronheimer, & Penrose (1972). 
Causal geodesics would presumably lead to some weaker causality condition than global hyperbolicity. 

515 It is the second (‘converse’) part of the proof of Theorem 10.6 below that does not work for causal geodesics 
instead of curves, since the curve $c$ constructed there is not necessarily a geodesic. This goes back to the definition 
of domains of dependence and Cauchy surfaces in terms of causal curves rather than geodesics. 

516 To bridge the gap with incomplete geodesics, note that, heuristically, an inextendible causal curve may either 
go off to infinity, or hover around in a compact set, or stop at some boundary of an extendible space-time, or hit 
something like a curvature singularity. The first possibility is not excluded a priori, but seems hard to combine 
with the key condition (10.82) below, and according to Penrose, in space-times that are asymptotically flat at null 
infinity this is even impossible. See Penrose (1979), p. 623. For example, condition (10.82) below may be satisfied 
in anti-de Sitter space, which is hardly singular. But anti-de Sitter space has a negative cosmological constant 
with timelike future null infinity. Secondly, hovering around in a compact set is impossible in a strongly causal 
space-time. If we therefore assume that our space-time is both strongly causal and asymptotically flat at null infinity, 
as is typically the case for black hole space-times, then we are left with inextendible causal curves that may either 
lead to the edge of an extendible space-time or crash into a singularity. Therefore, under the stated assumptions 
the situation only differs from the one in the singularity theorems in that our curves are not necessarily geodesics. 
Asymptotic flatness is not assumed in either Definition 10.5 or Theorem 10.6, so that global hyperbolicity excludes 
the local nakedness of even more singularities than those described by the incompleteness theorems. 

517 The following property makes a space-time past distinguishing, which on the causal ladder is well below strong 
causality (Minguzzi, 2019, chapter 4, Definition 4.46). However, through its implication of non-total imprisonment, 
strong causality is also used through invocation of Theorem 2.53 in Minguzzi (2019) in our proof of Theorem 10.6. 
Definition 10.5 A strongly causal space-time $(M,g)$ contains a locally naked singularity if 
there is a future-directed future-inextendible causal curve $c$ in $M$, and a point $x \in M$, such that 

$$I^-(c) \subset I^-(x). \tag{10.82}$$

Penrose’s strong cosmic censorship conjecture states that “generic” [in his own words: 
“physically reasonable”] space-times do not contain locally naked singularities.⁵¹⁸ 

The curve $c$ defines or represents this “singularity”, and $x \in M$ is in its chronological future.⁵¹⁹ 


For example, in Minkowski space-time, take $z \in J^-(x)$ and remove $z$. Then any fd 
future-inextendible timelike curve $c$ whose endpoint would have been $z$ satisfies (10.82). Of course, it 
should be defined precisely what “generic” means, lest these conjectures turn into a definition 
of genericity! Penrose did not do this, but we will return to this point in §10.5. The following 
theorem, due to Penrose (1979), characterizes, or redefines, his idea of strong cosmic censorship. 


Theorem 10.6 A strongly causal space-time has no locally naked singularities, i.e. satisfies 
strong cosmic censorship, iff it is globally hyperbolic. 


Proof.⁵²⁰ We prove the inference from global hyperbolicity to the absence of locally naked 
singularities by contradiction.⁵²¹ Suppose that $(M,g)$ is globally hyperbolic and that (10.82) 
holds for some $c$ and $x$. Take $y$ on $c$ and then a future-directed sequence $(y_n)$ of points on $c$, with 
$y_0 = y$. Because of (10.82) this sequence lies in $J^+(y)\cap J^-(x)$, which is compact by assumption. 
Hence $(y_n)$ has a limit point $z$ in $J^+(y)\cap J^-(x)$. Now define curves $(\gamma_n)$ as the segments of $c$ 
from $y$ to $y_n$. By Lemma 5.26, these curves have a uniform limit $\gamma$. Its arc length (as measured by 
an auxiliary complete Riemannian metric, see footnote 512) is, on the one hand, infinite (since $c$ 
is endless and hence has infinite arc length, which is approached as the $y_n$ move up along $c$). But 
on the other hand it is finite, since $\gamma$ must end at $z$ (and fd continuous causal curves have finite 
arc length iff they have an endpoint). Hence (10.82) cannot be true and the inference is established. 
The (contrapositive) proof of the converse implication relies on the following lemma:⁵²² 


Lemma 10.7 Let $(M,g)$ be a space-time, let $S \subset M$ be closed and achronal, and let $x, y \in M$. 

1. If $y \in \mathrm{int}(D^-(S))$, then $J^+(y)\cap J^-(S)$ is compact. In particular, taking $S = \partial I^-(x)$ 
and assuming $y \in I^-(x)$, it follows that $J^+(y)\cap J^-(x)$ is compact. 

2. We have $\mathrm{int}(D^-(S)) = I^-(S)\cap I^+(D^-(S))$. 


518 One may also use past-directed inextendible causal curves, replacing $I^-(\cdot)$ by $I^+(\cdot)$, etc. For strong cosmic 
censorship this definition is equivalent to the given one, as follows from Theorem 10.6 below. 

519 One may also regard $c$ or rather $I^-(c)$ as an ideal point of space-time. Assuming strong causality, Geroch, 
Kronheimer, & Penrose (1972) and in their wake Hawking & Ellis (1973), §6.8, show that both real points and ideal 
points of $M$ correspond to subsets $U \subset M$ that are: (i) open, (ii) past sets, i.e. $I^-(U) \subset U$, and (iii) indecomposable, 
in that $U \neq U_1 \cup U_2$ where $U_1$ and $U_2$ have properties (i) and (ii) and are neither empty nor equal to $U$. Such sets are 
called IPs (for Indecomposable Past sets), and those that are not of the form $U = I^-(x)$ for some $x \in M$ are TIPs (for 
Terminal IPs); these TIPs are $U = I^-(c)$ for some future-inextendible timelike curve $c$. 

520 A heuristic argument for part 2 of the theorem is that a locally naked singularity, represented by $c$ as in 
Definition 10.5, will not reach any wannabe Cauchy surface $\Sigma$ in $I^+(x)$, since it crashes at the singularity lying in 
the past of $x$. Conversely, if no Cauchy surface exists then one can construct such a curve $c$. See also §10.5 below. 

521 Penrose (1979) gives a considerably more complicated argument, in terms of his TIPs (which we avoid). 

522 We follow Penrose (1979), p. 624. The lemma combines Propositions 5.20 and 5.5 (h) in Penrose (1972). 
The first point is a variation on Proposition 5.39, in which $D(S)$ is replaced by $D^-(S)$. The 
specification follows from the property $J^-(\partial I^-(x)) = J^-(x)$. The second point is a simple 
consequence of the definitions in question. To prove the converse direction of Theorem 10.6 
(contrapositively), assume that $(M,g)$ is not globally hyperbolic. Then, under the assumption of 
strong causality, there are $x, y$ for which $J^-(x)\cap J^+(y)$ is not compact (cf. Definition 5.27, where 
strong causality implies non-imprisoning). In view of (5.96) we may assume that $y \in I^-(x)$. Part 1 
of Lemma 10.7 gives $y \notin \mathrm{int}(D^-(\partial I^-(x)))$. Part 2 gives some $y' \in I^-(x)$ with $y' \notin D^-(\partial I^-(x))$, 
so that, by definition of $D^-$, there exists some fd future-inextendible curve $c$ from $y'$ that avoids 
$\partial I^-(x)$. Since $y' \in I^-(x)$, this curve does lie in $J^-(x)$, and hence (10.82) holds.⁵²³ 


We now (ahistorically) move from strong to weak cosmic censorship. If our space-time
$(M,g)$, so far merely assumed strongly causal, is also asymptotically flat at null infinity, then
Definition 10.5 can be modified by requiring $x \in I^-(\mathscr{I}^+) = I^-(\mathscr{I}^+) \cap M = J^-(\mathscr{I}^+) \cap M$. This
means that the “singularity” represented by the inextendible curve $c$ is hidden from observers in
$I^-(\mathscr{I}^+)$, although it may be “naked” to observers in the black hole region, cf. (10.78).


Definition 10.8 1. A strongly causal space-time $(M,g)$ that is asymptotically flat at null
infinity contains a naked singularity if there is a future-directed future-inextendible
causal curve $c$ in $M$ and a point $x \in I^-(\mathscr{I}^+)$ such that $I^-(c) \subset I^-(x)$.


2. The weak cosmic censorship conjecture states that no strongly causal space-time arising 
from “generic” smooth and complete initial data contains a naked singularity. 


A slight change in the proof of Theorem 10.6, involving a case distinction on c, yields: 


Theorem 10.9 Weak cosmic censorship holds iff $I^-(\mathscr{I}^+)$ is globally hyperbolic.


In our formulation, strong cosmic censorship implies weak cosmic censorship.⁵²⁴ A slight
variation of Definition 10.8, which is relevant for black hole thermodynamics and uniqueness
theorems (see §10.10 - §10.12), has the same virtue. The domain of outer communication
$\mathrm{DOC}(\mathscr{I})$ of a space-time that is asymptotically flat at null infinity is defined by

$$\mathrm{DOC}(\mathscr{I}) := I^-(\mathscr{I}^+) \cap I^+(\mathscr{I}^-). \qquad (10.83)$$

If we now replace the condition $x \in I^-(\mathscr{I}^+)$ in Definition 10.8 by $x \in \mathrm{DOC}(\mathscr{I})$, we obtain
a slightly different formulation of weak cosmic censorship; let us call it DOC-WCC. On this
definition, DOC-WCC at least implies that $\mathrm{DOC}(\mathscr{I})$ is globally hyperbolic (there is no “iff”).
Of course, by changing ‘$x \in I^-(\mathscr{I}^+)$’ in Definition 10.8 to ‘$x \in R$’ for some causally
interesting region $R \subset M$, or even $R \subset \hat{M}$, one can engineer the definition of weak cosmic
censorship in any desired way,⁵²⁵ preferably with some corresponding version of Theorem 10.9.


523 All this can be checked in the Minkowskian example following Definition 10.5, where, assuming $z \in I^+(y)$, the
removal of $z$ ruins compactness of $J^+(y) \cap J^-(x)$ and hence global hyperbolicity. The existence of $c$ is trivial.

524 This follows from the definitions, but it also follows from Theorems 10.6 and 10.9 and the observation that if
$(M,g)$ is globally hyperbolic, then so is $I^-(\mathscr{I}^+)$: if $x \in J^+(y)$ for $x,y \in I^-(\mathscr{I}^+)$, then $J^+(y) \cap J^-(x) \subset I^-(\mathscr{I}^+)$.

525 For example, Tipler, Clarke & Ellis (1980), p. 176, define weak cosmic censorship as global hyperbolicity of
$J^-(\mathscr{I}^+) \subset \hat{M}$, which does not follow from global hyperbolicity of $I^-(\mathscr{I}^+) \subset M$. See e.g. Chruściel & Galloway
(2019). Penrose’s (1979) prose suggests replacing $I^-(\mathscr{I}^+)$ by $J^-(\mathscr{I}^+) \cap J^+(\Sigma)$, where $\Sigma$ is some wannabe
Cauchy surface in $M$. Hawking & Ellis (1973), p. 312, say that $(M,g)$ is future asymptotically predictable from
$\Sigma$ if the conformal completion $\hat{M}$ of $M$ contains an open subset $V \subset \hat{M}$ such that $J^-(\mathscr{I}^+) \cap M \subset V$ and $(V,\hat{g})$ is
globally hyperbolic. This is equivalent to the definition we just attributed to Penrose only under further regularity
assumptions (Królak, 1986, Lemma 2.10). See also Wald (1984), §12.1 and Chruściel (2020), §3.5.1.
10.5 Cosmic censorship in the initial value (PDE) formulation


Let us pause to take stock, especially with regard to the two points laid out at the beginning of 
§10.4 that were claimed to be missing from Penrose’s incompleteness/singularity theorem 6.15.
As to the second point, Theorem 6.15 has the following remarkable extension: 


Corollary 10.10 Under the assumptions of Penrose’s Theorem 6.15, the “singularity” defined 
by the ensuing incomplete lightlike geodesic cannot be locally naked (let alone naked). 


This follows from Theorems 10.6 and 5.34, simply because Theorem 6.15 assumes a Cauchy 
surface. Hence it does not even seem necessary to postulate the existence of an event horizon 
that covers the singularity! However, the standard black hole space-times studied in chapter 9 do 
have event horizons, which, acting as one-way membranes, accomplish more than dressing the 
singularity. Thus Corollary 10.10 merely confirms the mismatch between Theorem 6.15 and the 
physical concept of a black hole, which is predicated on having an event horizon; the latter still 
needs to be postulated on top of the assumptions of Theorem 6.15. It therefore seems that weak 
cosmic censorship cannot be settled at the axiomatic level and has to be dealt with by means of 
(preferably “generic”) case studies of black hole formation, so far with mixed results.⁵²⁶

For the first point in §10.4, at first sight strong cosmic censorship as in Definition 10.5
seems to have nothing to do with inextendibility. But in fact (and this must have been clear to
Penrose all along) it has everything to do with it! But let us first cause further confusion by
noting that in the light of Theorem 10.6, Penrose’s Definition 10.5 of strong cosmic censorship
is inappropriate if one adheres to the PDE approach to GR and especially to its second principle
laid out in §7.6, namely that all valid questions in GR are questions about the MGHD (or maximal
Cauchy development) $(M,g,\iota)$ of initial data $(\Sigma,\tilde{g},k)$. Indeed, a MGHD is always globally
hyperbolic, and hence strong cosmic censorship is automatic. However, this should disqualify
neither Definition 10.5 nor the notion of a MGHD; it is their combination that seems a mismatch.

To overcome this (whilst admitting that there is no crystal-clear logical path from Penrose’s
formulation to the PDE version below) we introduce the following variation of Definition 7.4:⁵²⁷


Definition 10.11 A development of initial data $(\Sigma,\tilde{g},k)$, satisfying the vacuum constraints, is
a triple $(M,g,\iota)$, where $(M,g)$ is a space-time solving the vacuum Einstein equations, and
$\iota : \Sigma \to M$ is an embedding such that $\iota^* g = \tilde{g}$ and $\iota(\Sigma)$ has extrinsic curvature $k$.


526 Christodoulou (1999b) proved his own (PDE) version of the weak cosmic censorship conjecture (Definition
10.13.2) for the spherically symmetric gravitational collapse of a scalar field, but on the basis of genericity conditions
whose relevance has been questioned (Gundlach & Martin-Garcia, 2007, §3.4). See also the references in footnote
297. More generally, the status of weak cosmic censorship seems mixed also in earlier heuristic formulations in
terms of an event horizon; see e.g. Joshi (1993, 2007), Królak (1999), and Ong (2020).

527 This theory is due to Chruściel (1992). Earlier, Moncrief & Eardley (1981), p. 889, proposed an ‘(informally 
stated) global existence conjecture’ stating that “Every asymptotically flat initial data set with tr K = 0 may be 
evolved to arbitrarily large times”, adding that its proof would ‘in essence prove the [weak] cosmic censorship 
conjecture for asymptotically flat space-times’. For initial data given on a compact Cauchy surface they propose 
something similar, and in doing so they opened the door to regarding cosmic censorship as a global existence 
problem for the Einstein equations. In this spirit, Moncrief (1981), p. 88, paraphrases Penrose’s strong cosmic 
censorship as expressed by Theorem 10.6 as: ‘the maximal Cauchy development of a generic initial data set is 
inextendible.’ Similarly, Chruściel, Isenberg, & Moncrief (1990) open their abstract as follows: ‘The strong cosmic 
censorship conjecture states that ‘most’ spacetimes developed as solutions of Einstein’s equations from prescribed 
initial data cannot be extended outside of their maximal domains of dependence.’ In §3 they further specify ‘most’ 
in terms of open and dense subsets in the space of initial data. 
This may be adapted to the case with matter. The difference between a development and a Cauchy
development is that in the former $\iota(\Sigma)$ is no longer required to be a Cauchy surface in $M$, so that
$(M,g)$ is not necessarily globally hyperbolic. Such a development is called maximal if there is
no extension $(M',g')$ that also satisfies the vacuum Einstein equations, and one proves existence
of maximal developments (but not uniqueness up to isometry, as in the globally hyperbolic case).
We now apply Penrose’s strong cosmic censorship to such a maximal development, i.e. 
require it to be globally hyperbolic. The connection with inextendibility is then easily made: 


Proposition 10.12 The maximal development of given initial data is globally hyperbolic iff the
MGHD of these data is inextendible as a solution to the vacuum Einstein equations.


Of course, “the” MGHD is only defined up to isometry, see Theorem 7.10, so be aware. 

Proof. As explained in §7.6, the set of isometry classes $[M,g,\iota]$ of Cauchy developments
$(M,g,\iota)$ of given initial data $(\Sigma,\tilde{g},k)$ is partially ordered, and by Theorem 7.12 the MGHD
$[M_t,g_t,\iota_t]$ is its top element. Hence if some maximal development $(M_m,g_m,\iota_m)$ is globally
hyperbolic then $[M_m,g_m,\iota_m] \leq [M_t,g_t,\iota_t]$. On the other hand, since $(M_t,g_t,\iota_t)$ is a solution and
$(M_m,g_m,\iota_m)$ is maximal, the converse also holds, so $(M_m,g_m,\iota_m) \cong (M_t,g_t,\iota_t)$.


Adding regularity conditions on the extensions,⁵²⁸ this would be a meaningful and natural PDE
version of strong cosmic censorship. Indeed, the requirement that also the (strongly censored)
extension satisfies the vacuum Einstein equations provides these regularity conditions, at least
up to a point: one may either go for extensions in which the metric is $C^2$, i.e. the borderline case
where the Einstein equations make sense classically, or allow $C^0$ metrics as long as the associated
Christoffel symbols are locally $L^2$, which is the least regular case in which the metric can still be
defined as a weak solution to Einstein’s equations.⁵²⁹ Indeed, a weak solution of the vacuum
Einstein equations is a metric $g$ for which for all compactly supported $X,Y \in \mathfrak{X}(M)$,


$$\int_M d^4x\, \sqrt{-\det(g(x))}\; R_{\mu\nu}(x)\, X^\mu(x)\, Y^\nu(x) = 0. \qquad (10.84)$$


Partial integration shows that this is well defined iff the $\Gamma^\rho_{\mu\nu}$ are locally $L^2$.
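
The partial integration can be made explicit as follows. This is only a sketch: the sign convention for the Ricci tensor is one common choice and is an assumption here, since the book's own convention is fixed in an earlier chapter not shown in this section.

```latex
% Ricci tensor in terms of Christoffel symbols (assumed sign convention):
R_{\mu\nu} = \partial_\rho \Gamma^\rho_{\mu\nu} - \partial_\nu \Gamma^\rho_{\rho\mu}
           + \Gamma^\rho_{\rho\sigma}\Gamma^\sigma_{\mu\nu}
           - \Gamma^\rho_{\nu\sigma}\Gamma^\sigma_{\rho\mu}.
% Move the derivative off \Gamma; there are no boundary terms because
% X and Y are compactly supported:
\int d^4x\, \sqrt{-\det g}\; (\partial_\rho \Gamma^\rho_{\mu\nu})\, X^\mu Y^\nu
  = - \int d^4x\; \Gamma^\rho_{\mu\nu}\,
      \partial_\rho\!\left(\sqrt{-\det g}\; X^\mu Y^\nu\right),
\qquad
\partial_\rho \sqrt{-\det g} = \sqrt{-\det g}\;\Gamma^\sigma_{\sigma\rho}.
```

The term $\partial_\nu \Gamma^\rho_{\rho\mu}$ is treated in the same way. After this step the integrand contains only terms of the form (continuous function) $\times\, \Gamma$ and (continuous function) $\times\, \Gamma\Gamma$, and the quadratic terms are locally integrable precisely when the $\Gamma$'s are locally square-integrable (by the Cauchy-Schwarz inequality).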
However, in the following PDE definition of strong cosmic censorship the extension is not
required to satisfy the Einstein equations! For convenience we also state the weak PDE version.


Definition 10.13 • The PDE-strong cosmic censorship conjecture states that the MGHD
of “generic” complete initial data is inextendible (in a regularity class to be specified).


• The PDE-weak cosmic censorship conjecture states that if “generic” complete initial
data have a MGHD that is asymptotically flat at null infinity (and hence admits a conformal
completion to begin with), then future null infinity $\mathscr{I}^+$ of this MGHD is complete.⁵³⁰


528 Chruściel, Isenberg, & Moncrief (1990) and Chruściel & Isenberg (1993) consider smooth extensions.

529 See Geroch & Traschen (1987), Christodoulou (2009), p. 9, and Luk (2017), footnote 1. This simple observation
should not be confused with the very deep result that having the Ricci tensor in $L^2$ is sufficient for the (vacuum)
Einstein equations to be weakly solvable at least locally (Klainerman, Rodnianski, & Szeftel, 2015).

530 See Definition 10.2.3. Christodoulou (1999a) reformulates this definition of weak cosmic censorship in such
a way that the idealization $\mathscr{I}^+$ no longer occurs. Let $(\Sigma,h,K)$ be asymptotically flat initial data for the Einstein
equations (satisfying the constraints), with MGHD $(M,g,\iota)$. Christodoulou (1999a) then defines $(M,g)$ to have
“complete future null infinity” iff for any $s > 0$ there exists a region $B_0 \subset B \subset \Sigma$ such that $\partial D^+(B)$, which is ruled
by lightlike geodesics, has the property that each lightlike geodesic starting in $\partial J^+(B_0) \cap \partial D^+(B)$ can be future
extended beyond parameter value $s$. Here $D^+(B)$ is the future domain of dependence of $B$, and each lightlike
geodesic in question is supposed to have tangent vector $L = T - N$, where $T$ is the fd unit normal to $\Sigma$ in $M$ and $N$ is
the outward unit normal to $\partial B$ in $\Sigma$. See Christodoulou & Klainerman (1993) for background on these constructions.
In view of the path from Penrose to PDE described above, part 1 of this definition is stronger
than the application of Penrose’s definition to the maximal development of the given initial
data in the sense of Definition 10.11. As a compromise, one might pose the conjecture relative
to extensions that satisfy some curvature condition, such as the timelike and/or null curvature
conditions assumed in the singularity theorems of Hawking and/or Penrose, respectively.

Although it would make sense in general, in practice the strong conjecture is posed for either
non-compact Cauchy surfaces $\Sigma$ with asymptotically flat initial data, or for compact $\Sigma$ (also
called the ‘cosmological’ case).⁵³¹ Except in relatively simple cases like Minkowski space-time,
in the asymptotically flat case the validity of PDE-strong cosmic censorship turns out to be very
sensitive to the precise regularity of the extension.⁵³² Sensitivity to the precise formulation of the
genericity conditions has not been much discussed in the literature,⁵³³ but already the simplest
examples (see §10.6) show that such conditions are necessary: strong cosmic censorship in any
version already fails for all parameter values of the Kerr metric (as long as $a \neq 0$ and $m \neq 0$),
and even for the Reissner-Nordström metric (again with $e \neq 0$ and $m \neq 0$); in the exact black
hole solutions discussed in this book it only holds (in both versions) for the Kruskal space-time.

This made it especially courageous (or, some might say, reckless) of Penrose to formulate the
strong conjecture. But of course he would not have done so without good arguments against the
counterexamples. His key observation, one of his most prophetic insights, was first published in
1968 (i.e. before even the weak version, which predated the strong one, was formulated), viz.⁵³⁴


There is a further difficulty confronting our observer who tries to cross $H_C^+$. As he looks
out at the universe he is “leaving behind,” he sees, in one final flash, as he crosses $H_C^+$, the
entire later history of the rest of his “old universe.” If, for example, an unlimited amount of
matter eventually falls into the star then presumably he will be confronted with an infinite
density of matter along “$H_C^+$”. Even if only a finite amount of matter falls in, it may not
be possible in generic situations to avoid a curvature singularity in place of $H_C^+$. This is at
present an open question. But it may be, that the place to look for curvature singularities is
in this region rather than (or as well as?) at the “center.” (Penrose, 1968, p. 222)


Our contention in this note is that if the initial data is generically perturbed then the Cauchy 
horizon does not survive as a non-singular hypersurface. It is strongly implied that instead, 
genuine space-time singularities will appear along the region which would otherwise have 
been the Cauchy horizon. (Simpson & Penrose, 1973, p. 184) 


Note that from Penrose’s point of view $H_C^+$ is a (future) Cauchy horizon relative to some wannabe
Cauchy surface $\Sigma$ inside some “large” analytically (possibly maximally) extended space-time.


531 See Ringström (2009) and Doboszewski (2017) for reviews of the cosmological case, where the strong (PDE)
conjecture seems to hold “generically” in regularity $C^2$ (and of course higher).

532 Proposition 6.2 is concerned with smooth extensions, but the version in Chruściel (2020), i.e. Proposition 4.4.3,
originating in Chruściel & Costa (2008), even works for $C^k$ extensions with $k \geq 2$. Hence a proof of inextendibility
based on an explicit classification of all causal or even just all timelike geodesics works for any $k \geq 2$. The
inextendibility of both Minkowski space-time and Kruskal space-time can be proved in this way; see e.g. Corollary
13.37 in O’Neill (1983). Minkowski space-time even turns out to be inextendible in $C^0$, which is a far more difficult
result (Sbierski, 2018ab). So here the validity of PDE-strong cosmic censorship is independent of the regularity.
However, for two-ended asymptotically flat data for the spherically symmetric Einstein-Maxwell-scalar field system
(to which the conjecture, so far discussed for the vacuum case, can be extended in the obvious way), the conjecture
fails in $C^0$, i.e. the MGHD is extendible with a $C^0$ metric, but it holds in $C^1$, in that the metric of the extension fails
to be $C^1$ (Dafermos, 2003, 2005). The situation for the Kerr metric is similar (Dafermos & Luk, 2017).

533 Specific papers that clearly state such conditions include Dafermos (2003) and Luk & Oh (2019a), §3. 

534 Here $H_C^+$ is the Cauchy horizon, which in the original text is denoted by $H_+(\mathscr{H})$. This is the only change.
From the PDE point of view, on the other hand, $H_C^+$ is the boundary of the MGHD of the corresponding
initial data on $\Sigma$. Either way, this “blueshift instability” of $H_C^+$ has been confirmed in a
large number of studies and hence remains the key to proofs of PDE-strong cosmic censorship.⁵³⁵

Failure of strong cosmic censorship is often taken to imply a failure of determinism in GR.⁵³⁶
This is true in a specific sense, which is slightly different for the Penrosian and the PDE versions,
but in both cases rests on the idea that the (classical) world (including the gravitational field
itself, or at least its physical degrees of freedom) is governed by hyperbolic partial differential
equations whose initial data should be given on a hypersurface $\Sigma$ and whose solutions should be
thereby determined on its domain of dependence $D(\Sigma)$.⁵³⁷ If our space-time $(M,g)$ is globally
hyperbolic, then it has a Cauchy surface $\Sigma$ such that $D(\Sigma) = M$ (see Proposition 5.38), and hence
all (scientific) things in $M$ are determined by their values on $\Sigma$. In PDE language, this formalizes
the notion of Laplacian determinism,⁵³⁸ which in fact goes back (at least) to Leibniz:


One sees then that everything proceeds mathematically - that is, infallibly - in the whole 
wide world, so that if someone could have sufficient insight into the inner parts of things, 
and in addition has remembrance and intelligence enough to consider all the circumstances 
and to take them into account, he would be a prophet and would see the future in the present 
as in a mirror.⁵³⁹ (Leibniz, undated)


An intelligence which could comprehend all the forces that set nature in motion, and all 
positions of all items of which nature is composed - an intelligence sufficiently vast to submit
these data to analysis - it would embrace in the same formula the movements of the greatest
bodies in the universe and those of the lightest atom; for it, nothing would be uncertain and
the future, as well as the past, would be present to its eyes.⁵⁴⁰ (Laplace, 1814)


From Penrose’s point of view, if a space-time $(M,g)$ fails to be globally hyperbolic, then, taking
some wannabe Cauchy surface $\Sigma$, neither the part $M \setminus D(\Sigma) \neq \emptyset$ of space-time behind the Cauchy
horizon of $\Sigma$ itself, nor things happening behind this horizon, are determined by initial data on $\Sigma$.


535 In the wake of Penrose (1968), see Simpson & Penrose (1973), McNamara (1978), Hiscock (1981), down
to Chesler, Narayan & Curiel (2020). Mathematically rigorous work started with Dafermos (2003); more recent
papers may be traced back from Van de Moortel (2020). The conclusion seems to be that Cauchy horizons turn into
so-called weak null singularities, which are null boundaries with $C^0$ metric but $\Gamma$ not locally $L^2$ (Luk & Sbierski,
2016; Luk, 2017). At least for one-ended asymptotically flat initial data, behind such a weak null singularity there
is also a strong curvature singularity at $r = 0$. See also Luk & Oh (2019ab) for the two-ended case, and Gajic &
Luk (2017) for extremal Reissner-Nordström black holes (i.e. $|e| = m > 0$). Such results concern cosmological
constant $\Lambda = 0$. See Dias et al. (2018ab) and Luna et al. for $\Lambda > 0$ (specifically de Sitter space), whose verdict on
strong cosmic censorship depends critically on both the matter coupling and the regularity of the extension. For
$\Lambda < 0$ even weak cosmic censorship seems to fail (Hertog, Horowitz, & Maeda, 2004; Crisford & Santos, 2017).

536 Earman (1995) is a classic, taken up among others by Doboszewski (2017, 2019, 2020) and Manchak (2020).

537 The classical exposition of this world view is Courant & Hilbert (1962), which unfortunately does not cover GR.
In that light, see also Choquet-Bruhat (2009) and, specifically for matter fields, Bär, Ginoux, and Pfäffle (2007).

538 Succinctly: ‘The world W is Laplacian deterministic just in case for any physically possible world W’, if W and
W’ agree at any time, then they agree at all times.’ (Earman, 1986, p. 13). Hence we assume $\Sigma$ to be spacelike.

539 The undated German original is quoted by Cassirer (1936), pp. 19-20: ‘Hieraus sieht man nun, das alles
mathematisch, d.i. uhnfehlbar zugehe in der ganzen weiten Welt, so gar, dass wenn einer eine genugsame Insicht in 
die inneren Teile der Dinge haben könnte, und dabei Gedächtnis und Verstand genug hätte, um alle Umstände vor 
zu nehmen und in Rechnung zu bringen, würde er ein Prophet sein, und in dem Gegenwärtigen das Zukünftige 
sehen, gleichsam als in einem Spiegel.’ English translation by the author. 

540 Translation taken from the English edition from 1902, p. 4. Note that Leibniz’ prophet appeals to the logical
structure of the universe that makes it deterministic, whereas Laplace’s intelligence knows (Newtonian) physics. 
Van Strien (2014) argues that Laplace also falls back on Leibniz and (uncharacteristically) gets the physics wrong 
by not mentioning the momenta that the intelligence should know, too, besides the forces and positions. 
From the PDE point of view, although any MGHD $(M',g')$ is globally hyperbolic with Cauchy
surface $\Sigma \subset M'$, if $(M',g')$ is extendible with extension $(M,g)$, then $\Sigma$ fails to be a Cauchy
surface for $M \supset M'$ and we are back to Penrose’s perspective; this is true even if the extension
$(M,g)$ satisfies the Einstein equations. A difference between Penrose and PDE arises if the
extension of “the” MGHD is not unique, which happens in some examples.⁵⁴¹ In that case the
lack of apparent determination is even worse, but otherwise the analysis is not greatly affected.

Either way, a lack of global hyperbolicity of (M,g) does not imply that space-time is 
indeterministic in the sense that random events occur in $M \setminus D(\Sigma)$, as in quantum mechanics.⁵⁴²
The point is rather that events beyond the Cauchy horizon of $\Sigma$ are not determined by the initial
data originally expected to do so. This is quite remarkable, but nonetheless such events may 
instead be determined by signals coming from a (locally) naked singularity, or, should it turn into 
some kind of weak singularity itself, as mentioned above, by events happening on the Cauchy 
horizon. To further weaken the connection between global hyperbolicity and determinism, let 
us note that the undeniable indeterminism of someone falling into a black hole singularity is 
perfectly well compatible with global hyperbolicity, as the Schwarzschild solution shows. 

More generally, in classical (mathematical) physics indeterminism may come from either a
lack of uniqueness of solutions or from a lack of existence thereof; the latter includes incompleteness
of solutions, i.e. non-existence after (or before) some finite time. Strong cosmic censorship
tries to secure uniqueness at the level of the Einstein equations (where existence is secure), but
as we have seen its failure does not automatically imply indeterminism per se. In our view, lack
of existence (e.g. for geodesic equations) is the more relevant source of indeterminism in GR.⁵⁴³

We briefly return to weak cosmic censorship, where the connection between the Penrosian and
the PDE versions is less clear than for strong cosmic censorship. Definition 10.13 is identical to
clause 3 in Definition 10.2. Although both contexts use (future) null infinity $\mathscr{I}^+$, the connection
with singularities/incompleteness, which indeed should be irrelevant to the concept of $\mathscr{I}^+$,
seems missing in connection with weak cosmic censorship.⁵⁴⁴ Also, whereas Penrose’s version
states that outgoing signals from a black hole singularity are blocked by an event horizon $H_E^+$,
the PDE version is about incoming signals: the further these are away from $H_E^+$, the longer it
takes them to enter $H_E^+$, and in the limit at null infinity this takes infinitely long. Nonetheless, in
§10.6 we shall see that in simple examples they match, because lack of global hyperbolicity of
$I^-(\mathscr{I}^+)$ gives a wannabe Cauchy surface $\Sigma$ a Cauchy horizon which cuts off $\mathscr{I}^+$.

In sum, there is no unique concept of weak or strong cosmic censorship. As a compromise,
one might summarize the conjectures as follows. In “physically reasonable” space-times:⁵⁴⁵


• weak cosmic censorship postulates the appearance and stability of event horizons;⁵⁴⁶


• strong cosmic censorship requires the instability and disappearance of Cauchy horizons.


541 This is especially true in the cosmological case. Examples include Misner space-time, Taub-NUT space-time, 
polarized Gowdy space-times, etc. See Earman (1995), Ringström (2009, 2010), and Doboszewski (2017). 

542 The proof of this shocking claim, going back to Born (1926), is in fact very recent (Landsman, 2020, 2021).

543 In non-relativistic mechanics bodies may disappear to infinity in finite time (Xia, 1992; Saari & Xia, 1995), 
and hence, by the same (time-reversed) token, may appear from nowhere in finite time and hence influence affairs 
in a way unforeseeable from any Cauchy surface. This analogy with GR is discussed by Earman (2007), §3.6. 

544 After a talk by the author on June 16, 2021, Mihalis Dafermos pointed out that PDE-weak cosmic censorship
should be seen as “weak weak cosmic censorship”, which is something like a test case for Definition 10.8.2. Yet 
neither Penrosian weak cosmic censorship nor PDE-strong cosmic censorship implies PDE-weak cosmic censorship. 

545 Penrose’s “physically reasonable” is preferable to the mathematicians’ “generic”, since the so-called fine-tuning
problem suggests that our cosmos is not at all generic! See Landsman (2016) and Adams (2019) for introductions.

546 Whenever, of course, these are expected, viz. when trapped surfaces form in gravitational collapse (Joshi, 2007). 
10.6 Cosmic censorship in some simple examples


In this section we analyze the relationship between the Penrosian and the PDE versions of
cosmic censorship from three key black hole examples and their Penrose diagrams:⁵⁴⁷


• Maximally extended Schwarzschild (i.e. Kruskal) with $m > 0$ (and two-sided initial data);


• Schwarzschild with $m < 0$, whose singularities and horizons look like those of supercharged
Reissner-Nordström ($|e| > m > 0$), or ultrafast rotating Kerr ($|a| > m > 0$);


• Reissner-Nordström with $0 < |e| < m$, which also resembles Kerr with $0 < |a| < m$.


In the first case the solution coincides with the MGHD of the pertinent (two-ended) initial data,
so the difference between strong Penrosian and strong PDE cosmic censorship fades. We have
already drawn the Penrose diagram of the maximally extended Schwarzschild solution with
$m > 0$ in §10.3. The maximal Cauchy development of a generic two-sided Cauchy surface $\Sigma$
with suitable initial data (drawn as a horizontal blue line) is simply the entire space-time. In
particular, the Cauchy horizon $H_C$ of $\Sigma$ is empty. The upper two green lines form the future
event horizon $H_E^+$ of the black hole area, which is the upside-down upper triangle (labeled region
II), whereas the lower two green lines form the past event horizon $H_E^-$ of the white hole area,
i.e. the lower triangle (region IV). The right-hand diamond is region I, the left-hand diamond is
region III. Fd causal curves cannot leave region II and they cannot enter IV.

Both cosmic censorship conjectures hold in both versions (i.e. Penrose and PDE):⁵⁴⁸


• Weak cosmic censorship for Kruskal space-time.


Penrose: $\Sigma$ is a Cauchy surface for $I^-(\mathscr{I}^+)$, making it globally hyperbolic.⁵⁴⁹


PDE: each component of $\mathscr{I}^+$ ends at timelike infinity $i^+$ and hence its lightlike
geodesics are future complete (as confirmed by parametrization and computation).


• Strong cosmic censorship for Kruskal space-time.


Penrose: Kruskal space-time is globally hyperbolic (since the causal structure of the
diagram is such that the line $\Sigma$ represents a Cauchy surface).


PDE: Explicit classification of the causal geodesics in Kruskal space-time $(M_K, g_K)$
shows that the antecedent of the second (“or”) part of Proposition 6.2 is satisfied: a
causal geodesic is incomplete iff it crashes into the singularity at $r = 0$, in which case
it has unbounded curvature because of (9.18). Otherwise, it goes to infinity, in which
case it is complete. Hence Kruskal space-time is inextendible (cf. footnote 532).
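
To illustrate the unbounded curvature at $r = 0$, one may use the standard Kretschmann scalar of the Schwarzschild metric. Whether (9.18) is exactly this invariant is an assumption here, since that equation is not shown in this section; the formula itself, in units $G = c = 1$, is standard.

```latex
% Kretschmann scalar of the Schwarzschild/Kruskal metric (units G = c = 1):
R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma} = \frac{48\, m^2}{r^6}
  \;\longrightarrow\; \infty \quad (r \to 0),
% whereas at the event horizon r = 2m it stays finite:
\left. R_{\mu\nu\rho\sigma} R^{\mu\nu\rho\sigma} \right|_{r = 2m}
  = \frac{48\, m^2}{64\, m^6} = \frac{3}{4\, m^4}.
```

Since this is a scalar, its divergence along a geodesic approaching $r = 0$ does not depend on the coordinates used, whereas its finiteness at $r = 2m$ reflects that the horizon is not a curvature singularity.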


However, for m < 0 Kruskal, Reissner-Nordström, and Kerr, differences arise between the 
Penrosian and the PDE perspectives, since in these cases the maximal (analytic) solutions differ 
from the MGHD of the pertinent initial data. In particular, although (curvature) singularities 
are not part of space-time in any case, they can at least be drawn as boundaries in the maximal 
solutions, where they lie behind a Cauchy horizon. But precisely for that reason singularities are 
beyond the scope of the corresponding MGHD. Here are the Penrose diagrams: 


547 This section is largely based on Hawking & Ellis (1973), pages 158 and 160, as well as on Dafermos &
Rodnianski (2008) and Dafermos (2014ab, 2017, 2019) for the PDE side. 

548 In view of the recently proved stability of Schwarzschild space-time (Dafermos et al., 2021), they also hold in
the informal version stated at the end of the previous section. So far, such a proof is lacking for the other cases. 

549 Alternatively: any incomplete future inextendible timelike curve $c$ must crash in the upper $r = 0$ singularity.
Hence $I^-(c)$ lies partly in region II, which is disjoint from $J^-(\mathscr{I}^+)$, so that $I^-(c) \not\subset I^-(x)$ for all $x \in I^-(\mathscr{I}^+)$.
Left: Penrose diagram of $m < 0$ Schwarzschild, or supercharged Reissner-Nordström ($|e| > m > 0$),
or fast Kerr ($|a| > m > 0$). These solutions have a singularity at $r = 0$, but unlike the $m > 0$ Kruskal
case it is not shielded by an event horizon. The red lines labeled $H_C^-$ and $H_C^+$ are past and future Cauchy
horizons with respect to the blue line, which indicates a maximal spacelike surface whose initial data give
rise to the metrics in question and whose maximal Cauchy development (MGHD) is the grey area.


Right: Penrose diagram of subcritical Reissner-Nordström ($0 < |e| < m$), whose event and Cauchy
horizons (despite the different structure of the singularity) also resemble those of slowly rotating Kerr
($0 < |a| < m$). The maximal Cauchy development (MGHD) of the pertinent initial data given on the
maximal spacelike hypersurface represented by the blue line labeled $\Sigma$ is again colored in grey. It contains
past and future event horizons labeled $H_E^-$ and $H_E^+$, drawn in green, but unlike the $m > 0$ Schwarzschild
case, the singularity they are supposed to shield cannot be reached directly from the maximal Cauchy
development, which is bounded by the various fictitious boundaries $\mathscr{I}^\pm$, $i^\pm$, and $i^0$, which lie at infinity,
as well as by the Cauchy horizons $H_C^\pm$, drawn in red, which can be reached in finite proper time.⁵⁵⁰
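
The parameter ranges in the two captions can be read off from the roots of the standard metric function. The following is a sketch in the usual static coordinates with $G = c = 1$; these conventions are assumptions here, as the metrics themselves are given in an earlier chapter not shown in this section.

```latex
% Reissner-Nordström: g_{tt} = -f(r), with (units G = c = 1)
f(r) = 1 - \frac{2m}{r} + \frac{e^2}{r^2}, \qquad
f(r) = 0 \iff r = r_\pm := m \pm \sqrt{m^2 - e^2}.
% 0 < |e| < m  : two real roots, 0 < r_- < r_+   (right-hand diagram);
% |e| > m > 0  : no real roots, f > 0 for r > 0  (left-hand diagram);
% e = 0, m < 0 : f(r) = 1 + 2|m|/r > 0 for r > 0 (left-hand diagram).
% Kerr analogue:
\Delta(r) = r^2 - 2mr + a^2 = 0 \iff r = m \pm \sqrt{m^2 - a^2}.
```

In the subcritical case the outer root $r_+$ locates the event horizon and the inner root $r_-$ the Cauchy horizon. When there are no positive roots, nothing separates the singularity at $r = 0$ from infinity, which is why the left-hand diagram has no event horizon.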


Despite the different space-times they apply to, the outcomes of the Penrosian version and the 
PDE version of both weak and strong cosmic censorship are once again the same:551


550 This diagram can be infinitely extended in both directions (Hawking & Ellis, 1973, pp. 158, 165): to the north, 
another grey area folds inside the upper two red line segments, and similarly to the south, but we do not do so here. 

551 For m < 0 Kruskal the initial data are not complete in this case, so strictly speaking the cosmic censorship
conjectures do not apply here. Nonetheless, they can be stated and the comparison is instructive. 
• m < 0 Kruskal (etc.): For the Penrosian total space-time the difference between weak
and strong cosmic censorship evaporates, since $I^-(\mathscr{I}^+) = M$, which is not globally
hyperbolic: wherever one tries to place a wannabe Cauchy surface $\Sigma$ (such as the blue
line), above the surface inextendible causal curves can be drawn that enter $i^+$ or $\mathscr{I}^+$ in the
future and enter the singularity at r = 0 in the past, without crossing $\Sigma$. Similarly, below
$\Sigma$ one may draw inextendible causal curves converging to the singularity in the future, and
to $i^-$ or $\mathscr{I}^-$ in the past, which once again do not cross $\Sigma$. Thus neither weak nor strong
cosmic censorship holds for this space-time.


The PDE picture applies to the grey area, which is the MGHD of the initial data given on the
blue line marked $\Sigma$ in the left-hand Penrose diagram. Then weak cosmic censorship fails
because future null infinity $\mathscr{I}^+$ is clearly incomplete: lightlike geodesics terminate at the
Cauchy horizon (where they “fall off” space-time) and hence are incomplete. On the other
hand, strong cosmic censorship fails because the grey space-time, though globally hyperbolic
(in contrast with the entire space as we have just seen), is evidently (smoothly, even
analytically) extendible, namely by the total space displayed. Though they do not coincide,
we see that strong and weak cosmic censorship are closely related: future incompleteness
of lightlike geodesics at null infinity happens because the MGHD is extendible.


• Subcritical Reissner-Nordström (0 < |e| < m): for both Penrose and PDE strong cosmic
censorship fails, whereas the weak version holds. In the Penrosian version the total space
fails to be globally hyperbolic because of the part above the grey area (i.e. beyond the future
Cauchy horizon $H_C^+$): one has past-directed inextendible causal curves that (backwards
in time) end up in the singularity and hence never cross $\Sigma$ (e.g. those crossing the upper
left, NW-pointing red line from N to SW). Weak cosmic censorship holds because of the
future event horizon $H_E^+$, which shields the upper r = 0 singularity above it. Equivalently,
$I^-(\mathscr{I}^+)$ is globally hyperbolic, a property it inherits from the MGHD.552


The PDE view is cleaner here: roughly speaking, as in the m > 0 Kruskal or Schwarzschild
case (but unlike the m < 0 case) future null infinity $\mathscr{I}^+$ ends at future timelike infinity $i^+$
and hence is complete, so that weak cosmic censorship holds.553 Strong cosmic censorship,
on the other hand, fails because the MGHD (marked in grey) is clearly smoothly extendible,
namely into, for example, the space-time shown.


If the strong Penrosian conjecture fails for some space-time $(M_P, g_P)$, then its lack of global
hyperbolicity typically occurs because $(M_P, g_P)$ is an extension of the MGHD (M, g) of some
given initial data, whose Cauchy surface $\Sigma$ fails to be one for $(M_P, g_P)$. Similarly, if $I^-(\mathscr{I}^+)$ is
not globally hyperbolic (so that there is a naked singularity), $M_P$ usually comes from extending
some (M, g), as above, whose Cauchy surface becomes a wannabe Cauchy surface in $M_P$, with
an associated future Cauchy horizon that cuts off $\mathscr{I}^+ \cap M$, causing its incompleteness.554

As already mentioned, the fact that these well-known examples violate (at least) strong 
cosmic censorship makes it all the more remarkable that the ensuing conjecture was made in the 
first place. In order to save it, such examples must be shown to be “non-generic”, for example 
through the blueshift instability mentioned in the previous section, or some other mechanism. 


552 This is no longer true for the maximal extension, which adds countably many components of $\mathscr{I}^+$. Keeping the
single $\Sigma$ shown would allow many causal curves in $J^-(\mathscr{I}^+)$ not hitting it, but adding countably many copies of $\Sigma$ in
the obvious way would allow causal curves hitting this total $\Sigma$ many times, which then cannot be a Cauchy surface.

553 It even holds in the maximal extension, driving the Penrose and PDE versions apart!

554 However, these aren’t rigorous deductions: there are pathological cases where strong cosmic censorship holds
whilst the weak version fails. See e.g. the Penrose diagram at the end of §2.6.2 of Dafermos & Rodnianski (2008). 
10.7 Structure of event horizons and Cauchy horizons


Most of the physics of black holes, including cosmic censorship, is concerned with various kinds 
of horizons. Three important types of black hole horizons one needs to be familiar with are: 


• Event horizons, defined in (10.79) based on Penrose’s concept of null infinity $\mathscr{I}$, i.e.


$H_E^\pm = \partial I^\mp(\mathscr{I}^\pm)$; (10.85)


• Cauchy horizons, defined in (5.178) - (5.182), applied to wannabe Cauchy surfaces S, i.e.


$H_C^\pm(S) = \partial D^\pm(S) \setminus S$. (10.86)


• Killing horizons (for stationary black holes), still to be defined, see §10.8.


These are all null hypersurfaces, as we will now prove for the first two cases (for the third it will 
be true by definition). For convenience, let us recall some relevant definitions from chapter 5: 


Definition 10.14 • A subset $S \subset M$ is acausal if no causal curve starts and ends at S.

• A subset $S \subset M$ is achronal if no timelike curve starts and ends at S.

• The edge of an achronal set S consists of all $x \in M$ for which every open nbhd U of x
contains points y and z and two timelike curves from y to z, of which just one intersects S.

• A future/past set $F \subset M$ satisfies $I^\pm(F) \subset F$ (if F is open this implies $I^\pm(F) = F$).

• An achronal boundary is a set $\partial F$ where F is a future set.556

• The domain of dependence/influence $D^\pm(S)$ of $S \subset M$ is the set of all $x \in M$ for which
every past/future-inextendible pd/fd causal curve starting from x intersects S.

• The domain of dependence of S is the union $D(S) = D^+(S) \cup D^-(S)$.

• A wannabe Cauchy surface is an acausal edgeless (and hence closed) subset of M.

• The future/past Cauchy horizon of a wannabe Cauchy surface S is given by (10.86).

• The Cauchy horizon of a wannabe Cauchy surface S is $H_C(S) = \partial D(S)$.

• A Cauchy surface is a wannabe Cauchy surface S for which D(S) = M, i.e. $H_C(S) = \emptyset$.


Further to Lemma 5.37, we collect some of the properties of such sets, without proof:557


555 Apparent horizons are briefly discussed in § 10.11. See also footnote 508. 

556 For $F = I^+(A)$, below (5.146) we already showed that an achronal boundary is indeed achronal. In general,
if $y \in I^+(x)$ for $x, y \in \partial F$, then $y \in I^+(\overline{F}) = I^+(F)$. But this is open, so $y \in \mathrm{int}(F)$ whilst $y \in \partial F$, which is a
contradiction. Conversely, a maximal achronal set is an achronal boundary, and since any achronal set is contained
in a maximal one, any achronal set is contained in an achronal boundary. See Minguzzi (2019), Theorem 2.87.

557 No. 1 is Proposition 2.136 in Minguzzi (2019), and the case $F = I^+(A)$ is Claim 2 on page 12 of Galloway
(2014). No. 2 is trivial, since $I^+(I^+(A)) = I^+(A)$ and $I^+(A)$ is open. No. 3 is Proposition 3.15 in Penrose (1972).
Lemma 10.15 1. Achronal boundaries are edgeless.

2. Any $F = I^\pm(A)$ is an open future/past set, for arbitrary $A \subset M$.

3. Given an achronal boundary $B = \partial F$, where F is a future set, there is a unique disjoint
decomposition $M = P \sqcup B \sqcup F$, where P is a past set and $B = \partial P$ (and likewise $F \leftrightarrow P$).558


It follows that both event horizons and Cauchy horizons of wannabe Cauchy surfaces are closed 
edgeless achronal topological hypersurfaces. But so are spacelike Cauchy surfaces in globally 
hyperbolic space-times, so our horizons must have special features that make them contain 
sufficiently many causal curves so as to become lightlike according to Corollary 5.15. These 
features are different, since although according to (10.85) - (10.86) both horizons are (part of) 
topological boundaries, future or past sets are very different from domains of dependence. Yet 
the second case will be reduced to the first! The key proposition for this is as follows: 


Proposition 10.16 1. Let $S \subset M$ be a closed subset of M with associated achronal boundary
$B = \partial I^+(S)$. (10.87)

If $x \in B \setminus S$, there exists a fd lightlike geodesic $\gamma$ that is contained in B with future endpoint
x and either a past endpoint on S or no past endpoint at all (i.e. $\gamma$ is past inextendible).

2. Let $S \subset M$ be a closed achronal subset of M with associated future Cauchy horizon $H_C^+(S)$.
If $x \in H_C^+(S) \setminus \mathrm{edge}(S)$, there exists a fd lightlike geodesic $\gamma$ that is contained in $H_C^+(S)$ with
future endpoint x and either a past endpoint on edge(S) or no past endpoint at all (ibid.).


For an example of the first case of part 1, take S = {0}, where 0 is the origin in M. Then B 
is the closed forward lightcone emanating from the origin, which includes the origin. For the 
second case, consider the left-hand figure in the next section §10.8, and take S to be the left-most
accelerated curve in region I. Then B is the entire SE-NW axis (i.e. x = −t) and so no pd lightlike
geodesic ever touches S. Both cases of part 2 can be covered by a single example, see next page. 
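The first example (S = {0}) can be checked by hand; the following minimal Python sketch does so in 2d Minkowski space-time. The helper names `on_B` and `generator` are illustrative and not taken from the text, and exact rational arithmetic is used so that the test t = |x| is not spoiled by rounding.

```python
from fractions import Fraction

def on_B(p):
    """True if p = (t, x) lies on B, the closed future lightcone t = |x| of the origin."""
    t, x = p
    return t >= 0 and t == abs(x)

def generator(p, steps=4):
    """Sample the past-directed straight null line s -> (1 - s) p for s in [0, 1].

    For p on B this is the lightlike geodesic through p claimed in part 1
    of the proposition; its past endpoint (s = 1) is the origin, i.e. a point of S.
    """
    t, x = p
    return [((1 - Fraction(k, steps)) * t, (1 - Fraction(k, steps)) * x)
            for k in range(steps + 1)]

p = (Fraction(3), Fraction(-3))          # a point of B \ S
path = generator(p)
assert all(on_B(q) for q in path)        # the geodesic never leaves B
assert path[-1] == (0, 0)                # and it ends on S = {0}
```

The same check fails, as it should, for a point off the cone such as (2, 1), which lies in the open set $I^+(0)$ rather than on its boundary.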

A similar result holds with past and future interchanged. If we add this, and in part 1 take
$S = \mathscr{I}^\pm$, noting that the boundaries $\mathscr{I}^\pm$ are closed in $\hat{M}$ (as $i(M) \subset \hat{M}$ is open by construction),
we obtain a result about event horizons. If in part 2 we take S to be a wannabe Cauchy surface 
and note that edge(S) is empty in that case, we have a result about Cauchy horizons. Thus: 


Corollary 10.17 1. Let $H_E^\pm$ be the future/past event horizon of a black/white hole. Then
any $x \in H_E^\pm$ lies on a future/past inextendible lightlike geodesic contained in $H_E^\pm$.

2. If $H_C^\pm(S)$ is the future/past Cauchy horizon of a wannabe Cauchy surface S, any
$x \in H_C^\pm(S)$ lies on a past/future inextendible lightlike geodesic contained in $H_C^\pm(S)$.


558 Penrose (1972), Remark 3.16, warns that although in Minkowski space one has $F = I^+(B)$ and $P = I^-(B)$ this
need not be true in general, with a specific counterexample already in $(0,1) \times \mathbb{R} \subset \mathbb{M}^2$ with Minkowski metric.

559 Both results are due to Penrose (1972), Theorems 3.20 and 5.12, though neither is stated in the context of
black holes! The first one may also be found in e.g. Wald (1984), Theorem 8.1.6, Galloway (2014), Proposition 
3.4, and Minguzzi (2019), Lemma 2.89 and Corollary 2.92. The second is Wald (1984), Theorem 8.3.5, Galloway 
(2014), Proposition 5.3, and Minguzzi (2019), Theorem 3.24. Each author uses a slightly different version of the 
curve limit lemma. Perhaps Penrose’s original proofs are now seen as heuristic, but in our view they are very clear. 
A closed achronal subset S of 2d Minkowski space-time, drawn in blue, starts at edge(S) on the right, and
then, always staying spacelike, asymptotes to the left-hand side of the backward lightcone of the origin.
Its domain of dependence $D^+(S)$ is drawn in red and its future Cauchy horizon $H_C^+(S)$ consists of the two
black lines, of which the right one ends at and includes edge(S), whilst the left one goes on downward
forever along the lightcone. Past-directed lightlike geodesics within the right-hand branch of $H_C^+(S)$ end
at edge(S) (after which they may leave $H_C^+(S)$), whereas those on the left are past inextendible.560


Thus $H_E^\pm$ is ruled by future/past inextendible lightlike geodesics (called the generators
of $H_E^\pm$),561 and similarly $H_C^\pm(S)$ is ruled by past/future inextendible lightlike geodesics.
Hence both event horizons $H_E^\pm$ and Cauchy horizons $H_C^\pm(S)$ are (topological) null hypersurfaces.
We now prove case 1 of Proposition 10.16 for the special case S = {y}, so that $B = \partial I^+(y)$.
This proof contains the idea of the general case. So take $x \in \partial I^+(y)$, then by definition there is a
sequence $(x_n)$ in $I^+(y)$ converging to x, and there are pd timelike curves $\gamma_n$ from $x_n$ to y, with


$\gamma_n : [0, b_n] \to M$; $\gamma_n(0) = x_n$; $\gamma_n(b_n) = y$, (10.88)
parametrized by h-arc length, cf. §5.6. By the curve limit lemma 5.26, there is a limit curve
$\gamma : [0, b] \to M$; $\gamma(0) = x$; $\gamma(b) = y$, (10.89)


where $b_n \to b$. This limit curve is causal and, coming from curves $\gamma_n$ in $I^+(y)$ as a uniform limit,
it lies in the closure $\overline{I^+(y)}$, which consists of $I^+(y)$ and its boundary B. If $\gamma$ contained any
point $z \in I^+(y)$, then $x \in I^+(y)$ by Proposition 5.4.5, but $I^+(y)$ is open and $x \in \partial I^+(y)$, so this
is impossible. Hence $\gamma$ must lie entirely in the achronal set B, so that by Corollary 5.15 it must
be a lightlike (pre)geodesic (which after reparametrization, if necessary, becomes a geodesic).
Finally, if $\gamma$ has a past endpoint w in B different from y, then the above construction could be
repeated with w in the role of x, duly extending $\gamma$. See footnote 563 for more information.

For the case of a general closed set S, we must replace the third entry in (10.88) by


$\gamma_n(b_n) = y_n$; $(y_n \in S)$. (10.90)


560Figure redrawn (and adapted) from Penrose (1972), Fig. 36, by Edith de Jong. 

561 These lightlike geodesics are inextendible in M; by Proposition 10.16 they may have future/past endpoints in
$\mathscr{I}^\pm$, as well as past/future endpoints in $H_E^\pm$ itself, before/after which they leave the horizon. This leaves the
possibility that lightlike geodesics on $H_E^+$ hit a singularity. In the proof of Hawking’s Area theorem (see §10.12) this
is excluded by some version of weak cosmic censorship. See also Wald (1994), §6.1.
If the sequence $(y_n)$ has an accumulation point y the proof is almost the same as above, since the
limit curve satisfies (10.89). If not, we use a trick:562 take a (geodesically) convex nbhd U of x
with compact closure and hence compact boundary $\partial U$, and take the intersections $z_n = \gamma_n \cap \partial U$.
By compactness of $\partial U$ the $z_n$ have an accumulation point z. We now restart the argument with
$y_n$ replaced by $z_n$, which works because $z_n \in I^-(x_n)$. This gives a causal limit curve from x to z,
which by the above reasoning must lie in B and must be a lightlike geodesic within B, etc.563
We now prove part 2 of Proposition 10.16 by reduction to case 1, which is possible because
$H_C^\pm(S)$ turns out to be part of an achronal boundary.564 Indeed, if, taking $H_C^+(S)$ to be concrete, we define


$W := I^+(H_C^+(S)) = I^+(S) \setminus \overline{D^+(S)}$, (10.91)


where either side could be taken as the definition and the other as an inference, we have


$H_C^+(S) = \overline{W} \cap \overline{D^+(S)}$; $\partial W = H_C^+(S) \cup \partial I^+(S) \setminus S$, (10.92)


and similarly for the past Cauchy horizon $H_C^-(S)$. This is easily proved, and verified in the
picture above. Let us also give two examples where S is edgeless (see next page):


• The upper picture is 2d Minkowski space-time with (1,1) deleted, and the x-axis is taken
to be our wannabe Cauchy surface S (which by definition is acausal and edgeless).


• In the Quinten space-time (i.e. 2d Minkowski space-time with the closed horizontal line segment from
(t,x) = (2,−1) to (2,1) removed), our wannabe Cauchy surface S is again the x-axis.
Unlike the previous example, where $\partial W = H_C^+(S)$, it illustrates the full scope of (10.92).
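The first of these examples is simple enough to turn into a membership test. The sketch below assumes 2d Minkowski space-time with the point (1,1) deleted and S the x-axis; the function name `classify` and the string labels are hypothetical. It relies on the observation that a past-inextendible causal curve from a point p with t > 0 can avoid S only by running into the deleted point, which requires (1,1) to lie in the causal past of p.

```python
from fractions import Fraction

DELETED = (Fraction(1), Fraction(1))   # the point removed from 2d Minkowski space-time

def classify(t, x):
    """Classify p = (t, x) with t > 0 relative to S = x-axis.

    Returns "D^+(S)" if every past-inextendible causal curve from p hits S,
    "H_C^+(S)" if p lies on the future lightcone of the deleted point,
    "W" if p lies strictly inside that lightcone.
    """
    t, x = Fraction(t), Fraction(x)
    if t <= 0:
        raise ValueError("only points above S (t > 0) are classified")
    if (t, x) == DELETED:
        raise ValueError("the point (1,1) is not part of the space-time")
    dt, dx = t - 1, abs(x - 1)
    if dt > dx:
        return "W"            # timelike future of the hole: I^+(H_C^+(S))
    if dt == dx:
        return "H_C^+(S)"     # on one of the two 45-degree lines from (1,1)
    return "D^+(S)"           # the deleted point is not in the causal past of p

assert classify(3, 1) == "W"
assert classify(2, 2) == "H_C^+(S)"
assert classify(Fraction(1, 2), 0) == "D^+(S)"
```

In the Quinten space-time the analogous test would need to handle the extra boundary piece $\partial I^+(S) \setminus S$, which is why that example shows the full content of (10.92).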


The proof of case 2 is now virtually the same as for case 1: take $x \in H_C^+(S)$, seen as a
specific component of the boundary of W. Since $W = I^+(H_C^+(S))$ there is a sequence $(x_n)$ in
$I^+(H_C^+(S))$ converging to x, for each $x_n$ there is $y_n \in H_C^+(S)$ with $x_n \in I^+(y_n)$, and hence there
are pd timelike curves $\gamma_n$ from $x_n$ to $y_n$, whose limit curve is the desired lightlike geodesic in
$H_C^+(S)$. Since it is enough for Corollary 10.17.2, we assume $\mathrm{edge}(S) = \emptyset$, in which case the
above construction can be repeated, so that this geodesic has no past endpoint in $H_C^+(S)$.


Corollary 10.18 If both the null curvature condition (see Theorem 6.15) and weak cosmic
censorship hold (in that $I^-(\mathscr{I}^+)$ is globally hyperbolic, cf. Definition 10.8 and Theorem 10.9),
then any future trapped surface S must lie entirely within the black hole region $B^+$.


We just sketch the proof (by contradiction).565 If S were to (partly) lie in $J^-(\mathscr{I}^+)$, then also part
of $I^+(S)$ lies in $J^-(\mathscr{I}^+)$ and hence some of the lightlike geodesics $\gamma$ in Proposition 10.16.1,
with past endpoint on S, would reach $\mathscr{I}^+$ and hence have infinite length. But the definition of a
trapped surface excludes this, as in the proof of Penrose’s singularity theorem (see §6.4).


562 Penrose (1972), p. 24. Other proofs are in Wald (1984), Theorem 8.1.6 and Galloway (2014), Proposition 3.4.

563 Lemma 5.26 assumes that (M, g) is globally hyperbolic, but this assumption is not necessary here: the second
bullet point in Lemma 5.40, which is case (ii) in Theorem 2.53 in Minguzzi (2019), can be excluded because we now
have the sharper assumption $\gamma_n(b_n) = y$ for all n (as opposed to $\gamma_n(b_n) \to y$ in Lemma 5.26), so that Proposition
5.21 makes b finite. Alternatively, one can use Chrusciel’s limit curve lemma mentioned in footnote 223. Our
argument is supposed to be a rigorous version of the corresponding proof in Geroch & Horowitz (1979), p. 234.

564The argument is due to Penrose (1972), proof of Theorem 5.12, pp. 44-45. For a different, very detailed proof 
see Minguzzi (2019), Theorem 3.24, compared to which the argument we give should be seen as heuristic. 

565 See Proposition 9.2.1 in Hawking & Ellis (1973), originally by Hawking (1972), corrected by Claudel (2000). 
2d Minkowski space-time with (1,1) deleted, and the x-axis as a wannabe Cauchy surface S. Then $D^+(S)$
is the region between the x-axis and the two 45° lines, so that W is the shaded green area above these
lines, which is excluded from $D^+(S)$ because any causal curve in W can disappear into the “singularity”
(1,1) instead of reaching S. Furthermore, $I^+(S)$ is the upper half plane without (1,1). Thus $\partial I^+(S) = S$,
and $\partial W = H_C^+(S)$ consists of the two 45° lines emanating from the deleted point (1,1).


Quinten space-time, where the closed red line segment from (t,x) = (2,−1) to (2,1) is deleted from 2d
Minkowski space-time, with wannabe Cauchy surface S again taken to be the x-axis. Then $I^+(S)$ is the
upper half plane minus the dashed blue triangle with vertices (2,−1), (2,1), and (3,0), including its
interior, and $D^+(S) \setminus S$ is the open region enclosed between the x-axis and the two 45° lines connected by
the red line (not included in $D^+(S)$). Furthermore, $H_C^+(S)$ consists of these 45° lines. Next, $\partial I^+(S) \setminus S$
consists of the blue upper sides of the triangle, and finally W is the region above the zig-zag pattern
formed by $H_C^+(S)$ and $\partial I^+(S)$, with boundary $\partial W$ as in (10.92). Figures by Edith de Jong-de Liefste.
10.8 Killing horizons and surface gravity


We turn to the third type of black hole horizon, which is important for the development of black 
hole thermodynamics. Unlike the previous two it is only defined if the metric is stationary.566


Definition 10.19 A Killing horizon in a space-time (M,g) is a connected null hypersurface
$H_K \subset M$ with a normal vector field N (which by definition is lightlike on $H_K$) that can be extended
to a Killing vector field X on some nbhd of $H_K$, in which nbhd it is lightlike solely on $H_K$.
Equivalently, a Killing horizon for a Killing vector field X defined on some open subset $U \subset M$
is a connected hypersurface $H_K \subset U$ that coincides with a connected component of the subset of
U where X is lightlike (and hence nonzero), and X is normal (and hence tangent) to $H_K$.


For example, in Schwarzschild black holes $X = \partial_t$ is timelike outside the hole, lightlike on the
event horizon, making this also a Killing horizon, and spacelike after crossing the horizon inwards
(see below for Kerr and Reissner-Nordström). This situation is surprisingly well mimicked by


$X = x\partial_t + t\partial_x$ (10.93)


in 2d Minkowski space-time (or indeed in any dimension). This is a boost generator and hence
an isometry,567 whose flow is well known from special relativity and is given by


$t(s) = t_0\cosh s + x_0\sinh s$; $x(s) = x_0\cosh s + t_0\sinh s$, (10.94)


where $s \in \mathbb{R}$ (i.e. X is complete).568 Some of the flow lines are displayed in the left-hand figure.
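As a sanity check on (10.94), the following Python sketch verifies numerically that the stated curves are integral curves of the boost field, i.e. dt/ds = x and dx/ds = t, and that the quantity $t^2 - x^2$ is conserved along them. The function names `flow` and `eta_XX` and the sample values are illustrative assumptions, not notation from the text.

```python
import math

def flow(t0, x0, s):
    """Boost flow (10.94) starting at (t0, x0), evaluated at parameter s."""
    return (t0 * math.cosh(s) + x0 * math.sinh(s),
            x0 * math.cosh(s) + t0 * math.sinh(s))

def eta_XX(t, x):
    """Minkowski norm of X = x d_t + t d_x at the point (t, x): -x^2 + t^2."""
    return t * t - x * x

t0, x0, s, h = 0.3, 1.2, 0.7, 1e-6
t, x = flow(t0, x0, s)

# Central differences approximate the tangent of the flow line at s.
tp, xp = flow(t0, x0, s + h)
tm, xm = flow(t0, x0, s - h)
assert math.isclose((tp - tm) / (2 * h), x, rel_tol=1e-6)   # dt/ds = X^t = x
assert math.isclose((xp - xm) / (2 * h), t, rel_tol=1e-6)   # dx/ds = X^x = t

# The flow is an isometry fixing the origin, so t^2 - x^2 is constant along it.
assert math.isclose(eta_XX(t, x), eta_XX(t0, x0), rel_tol=1e-9)

# A point on the line x = t stays on that line.
th, xh = flow(1.0, 1.0, s)
assert math.isclose(th, xh)
```

In particular a flow line starting where $t^2 - x^2 = 0$ never leaves that set, which is why the lines x = ±t are invariant under the flow.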


Left: Flow lines (black) and bifurcate Killing horizon (blue) of the vector field $X = x\partial_t + t\partial_x$ in 2d
Minkowski space. Clearly, X is timelike in the regions I and III, spacelike in regions II and IV, and lightlike
on the horizon, just like $\partial_t$ in the Kruskal case. The bifurcation surface is the origin.

Right: Bifurcate Killing horizon of the same vector field X in 3d. The bifurcation surface is the y-axis
(pointing out of the page). The bifurcation surface always has codimension 2 and e.g. for Kruskal is $S^2$.


566 For more information see e.g. Chrusciel (2020), §4.3, Aretakis (2013), §5.6, and Poisson (2004), chapter 5.

567 Killing’s equation $X_{\mu;\nu} + X_{\nu;\mu} = 0$ reads $X_{\mu,\nu} + X_{\nu,\mu} = 0$, with $X^0 = x$, $X^1 = t$ and hence $X_0 = -x$, $X_1 = t$.

568 This flow is not parametrized by proper time $\tau$. In region I, the Rindler wedge, putting $t_0 = 0$ this is achieved
by $t(\tau) = x_0\sinh(\tau/x_0)$ and $x(\tau) = x_0\cosh(\tau/x_0)$. This gives the well-known constant acceleration $1/x_0$. See e.g.
Misner, Thorne, & Wheeler, §6.6. In this context the x > 0 part of the horizon is called the Rindler horizon: it
represents the boundary of what can be (causally) known by the accelerating observers in region I (Rindler, 1956).
Since $\eta(X,X) = t^2 - x^2$, the Killing horizons of X are the following lines (or hypersurfaces):
x = t, x > 0; x = t, x < 0; x = −t, x > 0; x = −t, x < 0. (10.95)


These combine into a cross |x| = |t| (without the origin in 2d and without the y-axis in 3d); the
four (open) regions I, II, III, IV enclosed by the sides of the cross resemble a Kruskal diagram.569
This cross has an interesting structure, which again is shared e.g. by Kruskal space-time:


Definition 10.20 A bifurcate Killing horizon in a space-time is a union of four (connected)
Killing horizons (for the same Killing vector field X) connected by a submanifold $\mathscr{S}$ of dimension
two (generally of codimension two) on which X vanishes, called the bifurcation surface, and
from which each of the four horizons emanates in a lightlike direction orthogonal to $\mathscr{S}$.


This implies that a bifurcation surface $\mathscr{S} \subset M$ is spacelike. Conversely, $\mathscr{S}$ determines a
bifurcate Killing horizon, as follows.570 Suppose a Killing vector field X vanishes precisely
on a two-dimensional spacelike submanifold $\mathscr{S}$. By Minkowski geometry (cf. the end of §4.6
and §6.3), at each $x \in \mathscr{S}$ the tangent space $T_xM$ has a basis $(L, \underline{L}, e_1, e_2)$, where L and $\underline{L}$ are
lightlike, preferably normalized as in (6.58), $(e_1, e_2)$ is a basis of $T_x\mathscr{S} \subset T_xM$ (so that $e_1$ and $e_2$
are spacelike), and L and $\underline{L}$ are orthogonal to $e_1$ and $e_2$ (and hence to $T_x\mathscr{S}$). Since X = 0 on $\mathscr{S}$,
its flow $\psi_t$ leaves $\mathscr{S}$ pointwise invariant, so that its pushforward $(\psi_t)_* = T\psi_t$ maps each tangent
space $T_xM$ into itself ($x \in \mathscr{S}$). As X is a Killing vector field, each $\psi_t$ is an isometry of (M, g)
and each $T_x\psi_t$ is an isometry of $T_xM$. In particular, $T_x\psi_t(L)$ must be lightlike, so that it must be
proportional to either L or $\underline{L}$ (as $T_x\psi_t$ fixes $T_x\mathscr{S}$ pointwise and hence preserves its orthogonal
complement, which is spanned by L and $\underline{L}$). Since $T_x\psi_0 = \mathrm{id}$ and hence $T_x\psi_0(L) = L$, proportionality to $\underline{L}$ is
impossible by continuity in t. Hence there must be some function f such that


$T_x\psi_t(L) = f(t)L$. (10.96)


Consider the geodesic $\gamma_L$ for which $\gamma_L(0) = x$ and $\dot\gamma_L(0) = L$. In general,


$\gamma_{\psi_*Y}(\tau) = \psi(\gamma_Y(\tau))$; $\gamma_{sY}(\tau) = \gamma_Y(s\tau)$, (10.97)


where Y is any element of $T_xM$ and $\psi$ is any isometry. The equation on the left follows because
both sides are geodesics (this requires $\psi$ to be an isometry, since arbitrary diffeomorphisms
would not map geodesics to geodesics) with the same initial point $\psi(x)$ and tangent vector $\psi_*Y$
at that point. Taking Y = L and $\psi = \psi_t$ the flow of X, then shows that, for the above f(t),


$\psi_t(\gamma_L(\tau)) = \gamma_L(f(t)\tau)$, (10.98)


so that $\psi_t$ maps $\gamma_L$ to itself. This is only possible if X is proportional to $\dot\gamma_L$ throughout $\gamma_L$,
which in turn implies that X is lightlike throughout $\gamma_L$. Defining $H_K^+ := C^+$ as in (6.61), that is,
as the union of all fd lightlike geodesics emanating from $\mathscr{S}$ with tangent L (assuming that the
above basis is defined smoothly all over $\mathscr{S}$), we obtain a Killing horizon. The same construction
works with $-L$ and with $\pm\underline{L}$, yielding four Killing horizons $H_K^\pm$ and $\underline{H}_K^\pm$, which combine with
$\mathscr{S}$ to form a bifurcate Killing horizon. Note that by the same arguments the space between these
horizons is filled with geodesics whose tangents, still proportional to X, cannot be lightlike.

Conversely, we will show that if the surface gravity $\kappa$ on some Killing horizon $H_K$, which
we will now introduce, is strictly nonzero, then $H_K$ extends to a bifurcate Killing horizon.


569 There is a different way of looking at these Killing horizons, which has an analogue for black holes: if $\gamma$ is any
of the curves in region I, then the x = t line equals $\partial I^-(\gamma)$. Similarly, x = −t equals $\partial I^-(\gamma)$ for any $\gamma$ in region III.
570 What follows explicates an argument in Chruściel (2020), §4.3.2.
Proposition 10.21 Let $H_K$ be a Killing horizon for some Killing vector field X. Then on $H_K$,
$\nabla_X X = \kappa X$, (10.99)

for some function $\kappa$ defined on $H_K$, called the surface gravity of the horizon $H_K$. It satisfies
$X\kappa = 0$. (10.100)


Eq. (10.100) means that $\kappa$ is constant along the null generators of $H_K$ (i.e. the lightlike
pregeodesics with tangent X); in §10.12 we give conditions under which $\kappa$ is even constant on
$H_K$ (which is the zeroth law of black hole thermodynamics). Note that X is orthogonal but also
tangent to the null hypersurface $H_K$ (see §4.6), so that $X\kappa$ is well defined even if $\kappa$ is not defined
outside $H_K$. Clearly, the flow of X on $H_K$ is geodesic iff $\kappa = 0$, in which case $H_K$ is called
degenerate. Likewise, a Killing horizon is non-degenerate if $\kappa$ is nonzero throughout $H_K$.


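Proof. Since g(X,X) = 0 on $H_K$ one has Zg(X,X) = 0 for each Z tangent to $H_K$. Therefore,
$Zg(X,X) = (\nabla_Z g)(X,X) + g(\nabla_Z X, X) + g(X, \nabla_Z X) = 2g(\nabla_Z X, X) = 0$. (10.101)


Taking Y = X in (9.122) and using (10.101) gives $g(\nabla_X X, Z) = 0$ for each Z tangent to $H_K$,
which implies that $\nabla_X X$ must be normal to $H_K$ and hence proportional to X. This proves (10.99).
We derive (10.100) from an identity for any Killing vector field X and $Y, Z \in \mathfrak{X}(M)$, viz.571


$\nabla_Y\nabla_Z X - \nabla_{\nabla_Y Z}X = \Omega(Y,X)Z := ([\nabla_Y, \nabla_X] - \nabla_{[Y,X]})Z$. (10.102)
Putting $W = \nabla_X X$ in torsion-freeness $\mathcal{L}_X W = \nabla_X W - \nabla_W X$, and Y = Z = X in (10.102), gives
$\mathcal{L}_X(\nabla_X X) = 0$. (10.103)


Using (10.99), this gives $\mathcal{L}_X(\kappa X) = 0$, i.e. $(X\kappa)X + \kappa\mathcal{L}_X X = (X\kappa)X = 0$, whence (10.100).
We also give an equivalent but self-contained proof in coordinates, starting from the identity


$\nabla_\mu\nabla_\nu X^\alpha = R^\alpha{}_{\nu\mu\beta}X^\beta$, (10.104)
valid for any Killing vector field X. Using (9.122), i.e. $\nabla_\mu X_\nu + \nabla_\nu X_\mu = 0$, and (4.13) gives
$\nabla_\mu\nabla_\nu X_\alpha = -\nabla_\mu\nabla_\alpha X_\nu = -\nabla_\alpha\nabla_\mu X_\nu + R_{\nu\beta\alpha\mu}X^\beta = \nabla_\alpha\nabla_\nu X_\mu + R_{\nu\beta\alpha\mu}X^\beta$
$= \nabla_\nu\nabla_\alpha X_\mu + R_{\mu\beta\alpha\nu}X^\beta + R_{\nu\beta\alpha\mu}X^\beta = -\nabla_\nu\nabla_\mu X_\alpha + R_{\mu\beta\alpha\nu}X^\beta + R_{\nu\beta\alpha\mu}X^\beta$
$= -\nabla_\mu\nabla_\nu X_\alpha + R_{\alpha\beta\mu\nu}X^\beta + R_{\mu\beta\alpha\nu}X^\beta + R_{\nu\beta\alpha\mu}X^\beta$. (10.105)


From (4.24) and (4.36) - (4.38) we then obtain (10.104). Furthermore, also for later, we have
$\kappa^2 = -\tfrac{1}{2}(\nabla^\mu X^\alpha)(\nabla_\mu X_\alpha)$, (10.106)
valid on the Killing horizon $H_K$. To derive this, use (8.94), which in coordinates reads
$(\nabla_\nu X_\mu)X_\rho + (\nabla_\mu X_\rho)X_\nu + (\nabla_\rho X_\nu)X_\mu = 0$, (10.107)
contract with $\nabla^\nu X^\mu$, and use (10.99) and (3.74). Applying $X^\alpha\nabla_\alpha$ to (10.106), eq. (10.104) gives


$2\kappa X^\alpha\nabla_\alpha\kappa = -R^\nu{}_{\mu\alpha\beta}X^\alpha X^\beta\nabla^\mu X_\nu = 0$.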


571 For a proof see e.g. Aretakis (2013), page 87. Our subsequent coordinate proof follows Poisson (2004), §5.5.2.
As explained after (3.48), where we restrict the setting to $H_K$ (which can be done since X is
tangent to it), eq. (10.99) shows that the flow of X can be reparametrized to make it geodesic.
Since X is lightlike on $H_K$ by definition, the ensuing flow consists of lightlike geodesics. Hence
the lightlike geodesics ruling the null hypersurface $H_K$ according to Proposition 6.11 are
reparametrized flow lines of X. Suppose X = fL for some function f defined at least on $H_K$, and
L a null vector field on $H_K$ so that $\nabla_L L = 0$, i.e. its flow is geodesic. Then $\kappa = Lf$, i.e.


$f(\tau) = \kappa\tau + c$; $X(\tau) = (\kappa\tau + c)L$, (10.108)


along the geodesic flow $\tau \mapsto \gamma_L(\tau)$ of L, where $\tau$ is an affine parameter. If $\kappa \neq 0$, then X vanishes
at $\tau = -c/\kappa$, which means that the Killing horizon has hit the bifurcation surface of a bifurcate
Killing horizon, provided that $\gamma_L$ can be extended that far. We close with some examples.572
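The step from X = fL to $\kappa = Lf$ and then to (10.108) can be spelled out as follows. This is a sketch under the stated assumptions ($\nabla_L L = 0$ and $\tau$ affine along the generators); the closing example uses the boost field (10.93) on the line x = t, with $L = \partial_t + \partial_x$ chosen here purely for illustration.

```latex
% Assumptions: X = f L on the horizon, \nabla_L L = 0, \tau affine along L.
\begin{align*}
\nabla_X X &= \nabla_{fL}(fL) = f\,(Lf)\,L + f^2\,\nabla_L L = (Lf)\,X ,
\end{align*}
% so comparison with (10.99) gives \kappa = Lf = df/d\tau.
% By (10.100), \kappa is constant along each generator, hence
\begin{align*}
f(\tau) &= \kappa\tau + c .
\end{align*}
% Illustration with (10.93) on x = t: take L = \partial_t + \partial_x and \tau = t. Then
\begin{align*}
X &= x\,\partial_t + t\,\partial_x = t\,(\partial_t + \partial_x) = \tau L ,
\qquad f(\tau) = \tau , \qquad \kappa = 1 , \qquad c = 0 ,
\end{align*}
% and X vanishes at \tau = -c/\kappa = 0, i.e. at the origin, which is the bifurcation surface.
```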


• The surface gravity for the Killing vector field (10.93) in Minkowski space-time is given
by $\kappa = \pm 1$ on the x = ±t components of the Killing horizon.


• In the Schwarzschild solution (9.15) in the original coordinates $(t, r, \theta, \varphi)$ the obvious
Killing vector field is $X = \partial_t$, but since these coordinates do not apply exactly where
things become interesting, namely at r = 2m, we switch to ingoing Eddington-Finkelstein
coordinates $(v, r, \theta, \varphi)$, with metric (9.44), and take (or: write) $X = \partial_v$, which is the same
vector field (as a computation shows). The metric (9.44) shows that X is timelike for
r > 2m, lightlike at r = 2m, and spacelike throughout 0 < r < 2m. In particular, the future
event horizon $H_E^+$ defined in Theorem 9.1 is a Killing horizon, too. We may then compute
$\kappa$ from its definition (10.99), which simply comes down to $\kappa = \Gamma^v_{vv}$. This can be computed
from (4.15) and (9.44), yielding $\Gamma^v_{vv} = m/r^2$, which at r = 2m gives $\kappa = 1/4m$.


• In the Kruskal solution (9.55), for the Killing vector field (9.63) one finds that $\kappa = 1/4m$
on the two SW-NE Killing horizons (including the future or black hole event horizon
just treated), and $\kappa = -1/4m$ on the SE-NW ones (and hence in particular on the past or
white hole event horizon). The bifurcation surface is the two-sphere at the origin.


• The Reissner-Nordström metric (9.88) with 0 < |e| < m and $X = \partial_t = \partial_v$ has two Killing
horizons, which coincide with the inner and outer horizons $H_\pm$ of Theorem 9.2. This
follows from (9.94), which makes $\partial_v$ lightlike iff h(r) = 0, which is the case at $r = r_\pm$. The
surface gravities $\kappa_\pm$ coincide with those already labeled as such in (9.92) and (9.106). In
the extremal case |e| = m > 0 the surface gravity on the single remaining Killing horizon
= event horizon = Cauchy horizon vanishes. For |e| > m > 0 there is no horizon at all.


• The Kerr metric (9.110) has a second Killing vector $\partial_\varphi$, apart from $\partial_t$, which again
coincides with $\partial_v$ as used in (9.137). This additional symmetry makes the choice of X
ambiguous, but the Killing horizon of X coincides with the horizon $H_\pm$ in Theorem 9.3 if


$X_\pm := \partial_t + \Omega_\pm\partial_\varphi$, (10.109)
see (9.143). With this choice of X, the surface gravity at $H_+ = H_E^+$ is given by


$\kappa_+ = \dfrac{\sqrt{m^2 - a^2}}{2\left(m^2 + m\sqrt{m^2 - a^2}\right)}$,


at least if 0 < |a| < m. The extremal case |a| = m > 0 has $\kappa = 0$ (and vice versa), and in
the ultrafast case |a| > m > 0 there is no horizon whatsoever, but a naked singularity.
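The values quoted in these examples can be cross-checked numerically. The Python sketch below rests on two assumptions that are not visible in this section: that the Eddington-Finkelstein metric (9.44) has $g_{vv} = -(1 - 2m/r)$ and $g_{vr} = 1$, so that $\Gamma^v_{vv} = \tfrac{1}{2}\partial_r(1 - 2m/r)$, and that the Kerr surface gravity has the standard form $\sqrt{m^2 - a^2}/(2(m^2 + m\sqrt{m^2 - a^2}))$. All function names are illustrative.

```python
import math

def boost_X(t, x):
    """Components (X^t, X^x) of the boost field (10.93)."""
    return (x, t)

def boost_nabla_X_X(t, x):
    """Components of nabla_X X in flat coordinates: X^nu d_nu X^mu = (t, x)."""
    return (t, x)

def gamma_v_vv(m, r, h=1e-6):
    """Gamma^v_vv = (1/2) d/dr (1 - 2m/r), by central difference (assumed form of (9.44))."""
    f = lambda rr: 1.0 - 2.0 * m / rr
    return 0.5 * (f(r + h) - f(r - h)) / (2.0 * h)

def kappa_kerr(m, a):
    """Assumed Kerr surface gravity on the outer horizon, for 0 <= |a| <= m."""
    if not (m > 0 and abs(a) <= m):
        raise ValueError("requires m > 0 and |a| <= m")
    root = math.sqrt(m * m - a * a)
    return root / (2.0 * (m * m + m * root))

# Boost: nabla_X X = +X on x = t and -X on x = -t, i.e. kappa = +1 and -1.
assert boost_nabla_X_X(2.0, 2.0) == boost_X(2.0, 2.0)
assert boost_nabla_X_X(2.0, -2.0) == tuple(-c for c in boost_X(2.0, -2.0))

m = 1.5
# Schwarzschild: Gamma^v_vv = m / r^2, which equals 1/(4m) at r = 2m.
assert math.isclose(gamma_v_vv(m, 2 * m), 1 / (4 * m), rel_tol=1e-6)
# Kerr reduces to Schwarzschild at a = 0 and is degenerate at |a| = m.
assert math.isclose(kappa_kerr(m, 0.0), 1 / (4 * m))
assert kappa_kerr(m, m) == 0.0
```

The a = 0 limit reproducing 1/4m is the consistency check that fixes the factor 2 in the denominator of the Kerr expression.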


572 The coincidence of Killing horizons and event horizons is no coincidence and will be taken up in §10.10.
10.9 Black hole uniqueness theorems: Static case


Uniqueness theorems in GR, more specifically in the theory of black holes, typically refer to 
claims to the effect that under certain assumptions appropriate to the black hole setting, at least 
the space-time outside the event horizon must be (locally) isometric to one of the classical 
exact solutions, such as Schwarzschild, Reissner-Nordström, Kerr, or Kerr-Newman. Thus the 
uniqueness theorems formalize what (following Wheeler) is often called the “no hair” property: 


Perhaps the greatest surprise from the golden age [i.e. 1963-1975] was general relativity’s 
insistence that all properties of a black hole are precisely predictable from just three numbers: 
the hole’s mass, its rate of spin, and its electric charge.573 From those three numbers, if
one is sufficiently clever at mathematics, one should be able to compute, for example, the 
shape of the hole’s horizon, the strength of its gravitational pull, the details of the swirl of 
space-time around it, and its frequencies of pulsation. (Thorne, 1994, p. 327) 


The scope of the black hole uniqueness theorems ranges from the early (misnamed) Birkhoff 
theorem from 1923, see below, to Penrose’s all-encompassing final state conjecture:574


A body, or collection of bodies, collapses down to a size comparable to its Schwarzschild 
radius, after which a trapped surface can be found in the region surrounding the matter. Some 
way outside the trapped surface region is a surface which will ultimately be the absolute 
event horizon. But at present, this surface is still expanding somewhat. Its exact location 
is a complicated affair and it depends on how much more matter (or radiation) ultimately 
falls in. We assume only a finite amount falls in and that GIC is true. Then the expansion of 
the absolute event horizon gradually slows down to stationarity. Ultimately the field settles 
down to becoming a Kerr solution (in the vacuum case) or a Kerr-Newman solution (if a 
nonzero net charge is trapped in the “black hole”). (Penrose, 1969, pp. 1157-1158) 


Here GIC refers to what Penrose (1969) called the Generalized Israel Conjecture, i.e., 


if an absolute event horizon develops in an asymptotically flat space-time, then the solution 
exterior to this horizon approaches a Kerr-Newman solution asymptotically with time. 
(Penrose, 1969, p. 1156)


In the static case, which Israel himself proved (albeit under very restrictive assumptions) in 
two papers that launched the modern era,575 this means that the solution exterior to this horizon
equals the Reissner-Nordström solution (and hence the Schwarzschild solution in the vacuum 
case). This requires the inference of spherical symmetry from staticity, which is a (much more 
difficult) converse of the inference of staticity from spherical symmetry in Birkhoff’s theorem. 


573 The latter seems zero in astrophysical reality (where nonetheless black holes in active galactic nuclei are
surrounded by magnetic fields), unless ’t Hooft’s idea that elementary particles are tiny black holes is viable.

574 ‘The conjecture is extremely open, in the sense that even a reasonable formulation is unknown.’ (Wong, 2009)

575 These are Israel (1967, 1968). Overviews of the uniqueness theorem, including references to the original 
literature (some of which will also be cited below) include Hawking & Ellis (1973), §9.3, Carter (1986), Heusler 
(1996), and Chrusciel, Lopes Costa, & Heusler (2012). The history of the theorems is discussed in first-hand 
accounts by Israel (1987), Carter (2006), Thorne (1994), chapter 7, and Robinson (2009). Briefly, the “no hair” 
conjecture originated in Moscow from work by Ginzburg on quasars and independently Doroshkevich, Novikov,
and Zeldovich on deformations of black holes. In 1965 Novikov presented this work at the GR4 conference in 
London, through which it reached the West, where the idea was picked up by Wheeler and his former students like 
Thorne and Misner, by Israel, and subsequently, via the latter, by Carter, Hawking, and others. 
Since a complete treatment of the uniqueness theorems would require an entire monograph,
our aim here is just to generate some feeling for these theorems by discussing a few special cases 
in some detail, and even those, for clarity, under stronger assumptions than strictly needed, and 
with mere outlines of the main steps in the proofs (which would take pages per step if done in 
detail). Stronger up-to-date results will be mentioned along the way without proof. 


As already mentioned, the first uniqueness theorem for black holes was Birkhoff’s:576


Theorem 10.22 Any spherically symmetric solution to the vacuum Einstein equations is locally 
isometric to the Schwarzschild solution (i.e. for all values of r > 0). 


The most remarkable aspect of this theorem is that spherical symmetry implies staticity, but even 
if staticity is assumed the conclusion would be non-trivial. Of course, everything is predicated 
on the exact definition of spherical symmetry. Using coordinates $(x^\mu)$ where the rotation group
SO(3) acts trivially on $x^0 = t$, and acts in the usual way on $(x^1,x^2,x^3)$, SO(3)-invariance forces


$g_{ij} = A\,\delta_{ij} + B\,x^i x^j; \quad g_{0i} = C\,x^i; \quad g_{00} = -D$, (10.111)


where A, B, C, and D depend on $x_1^2 + x_2^2 + x_3^2$ and $x^0 = t$ only. Replacing $(x^1,x^2,x^3)$ by spherical
coordinates $(r,\theta,\varphi)$ but redefining r so that the volume element of $S^2$ is $r^2 d\Omega$, cf. (9.17), gives


$g = -E\,dt^2 + 2F\,dr\,dt + G\,dr^2 + H\,d\Omega$, (10.112)

where E, F, G, and H depend on r and t only. A further coordinate transformation then yields

$g = I(u,r)\,du^2 + 2J(u,r)\,du\,dr + K(u,r)\,d\Omega$, (10.113)


which the vacuum Einstein equations then force into the Schwarzschild metric (9.44) or (9.45) 
in Eddington-Finkelstein coordinates (the lengthy and dull computations are left to the reader). 
But the above concept of spherical symmetry was coordinate-dependent. We can do better: 


Definition 10.23 A space-time (M,g) is spherically symmetric if SO(3) is a nontrivial subgroup
of the group Iso(M,g) of its isometries and the orbits of SO(3) are isometric to spacelike
two-spheres $S^2_r$ of some radius r > 0 (endowed with the usual round metric $r^2 d\Omega$).577


The following (very technical) lemma will do much of the work. Part 1 may sound trivial given
Definition 10.23, but its thrust lies in the precise meaning of a foliation (see footnote 578). 


576 What is called Birkhoff’s theorem was also independently discovered by Jebsen (1921), Alexandrow (1923),
and Eisland (1925). See Johansen & Ravndal (2006) and Ehlers & Krasiński (2006). The name-giving source is
Birkhoff (1923), which is sometimes cited as Birkhoff & Langer (1923); the cover says ‘By George David Birkhoff,
PhD, with the cooperation of Rudolph Ernest Langer, PhD’. Most GR textbooks contain computations supporting
the theorem, e.g. Misner, Thorne, & Wheeler (1973), §23.2 and §32.2, is very clear. The Ansatz (10.111) is taken 
from Deser & Franklin (2005) and the subsequent (original) use of Eddington-Finkelstein coordinates is due to van 
Oosterhout (2019), which contains a detailed derivation of (9.44) or (9.45) from (10.113); this has the advantage 
of not being limited to r > 2m. Complete and rigorous geometric proofs are hard to find. We follow Hawking & 
Ellis (1973), Appendix B, which relies on Lemma 10.24 due to Schmidt (1967). See also van Oosterhout (2019). 
Birkhoff’s theorem was extended to electrovac space-times by Hoffmann (1932ab). 

577 This excludes Minkowski space-time $(\mathbb{M}, \eta)$, which near r = 0 is not foliated by two-spheres! Birkhoff’s
theorem produces $(\mathbb{M}, \eta)$ without the t-axis, which can be added by moving back to Cartesian coordinates.
Lemma 10.24 1. A spherically symmetric space-time (M,g) is foliated by two-spheres.578
2. Each x ∈ M has a nbhd $U_x$ containing a 2d submanifold $\mathcal{N}_x$ through x that intersects each
orbit (i.e. two-sphere $S^2$) overlapping with $U_x$ exactly once and does so orthogonally.


3. For any two (nearby) orbits $\mathcal{O}$ and $\mathcal{O}'$ the map $\mathcal{O} \to \mathcal{O}'$ that sends $x \in \mathcal{O}$ to $\mathcal{N}_x \cap \mathcal{O}'$
(provided this is nonempty, in which case it has one element) is a conformal diffeomorphism
whose conformal factor Ω is constant on $\mathcal{O}$ (i.e. depends on $\mathcal{O}$ and $\mathcal{O}'$ alone).


Visualizable examples in n = 3 include $\mathbb{R}^3 \setminus \{0\}$, seen as Euclidean space (minus the origin)
foliated by two-spheres (G = SO(3)), and $\mathbb{R}^3 \setminus \{x^1 = x^2 = 0\}$, seen as Minkowski space-time
$\mathbb{M}^3$ in d = 2+1, foliated by circles in planes with $x^0$ = constant (G = SO(2)). In the first
example of the previous footnote $\mathcal{N}_x$ is (locally) simply the radial line through x. In the second,
it is (locally) the plane defined as the product of the radial line through x and the t-axis.


Proof. If a Lie group G acts smoothly on M and dim($G_x$) is constant, where
$G_x = \{g \in G \mid gx = x\}$, (10.114)


is the stabilizer of x, then the associated vector fields defined by (8.246) define a foliation of M,
whose leaves are the connected components of the G-orbits $\mathcal{O}_x = Gx$. This is the situation here,
with G = SO(3) and $G_x \cong SO(2)$, and connected orbits ≅ $S^2$. This proves the first claim.

For the second claim,579 the more general fact is that there is such an $\mathcal{N}_x$ provided each
$\psi \in G_x$ (different from the identity) satisfies $\psi_* X = X$ iff X = 0, for $X \in T_x\mathcal{O}_x$. This assumption
certainly holds if the SO(3) orbits are all two-spheres, in which case $G_x \cong SO(2)$ rotates
$T_x S^2 \cong \mathbb{R}^2$ (note that since ψ(x) = x, the pushforward $\psi_*$ maps $T_xM$ to itself). To prove
this, define $\mathcal{N}_x$ as the submanifold generated by all geodesics emanating from x with tangents
orthogonal to the orbit $\mathcal{O}_x$. The slice theorem for compact Lie group actions gives the required
nbhd $U_x$ (where $S = \mathcal{N}_x \cap U_x$ acts as the slice). We now show that if $X \perp T_x\mathcal{O}_x$ and $\psi \in G_x$, then
$\psi_* X = X$. If $Y := \psi_* X \neq X$, then (because ψ is an isometry) $Y \perp T_x\mathcal{O}_x$, and the different
geodesics $\gamma^{(x)}_X$ and $\gamma^{(x)}_Y$ both intersect some orbit $\mathcal{O}$ near x. But since ψ is an isometry,


$\psi(\gamma^{(x)}_X \cap \mathcal{O}) = \gamma^{(x)}_Y \cap \mathcal{O}$, (10.115)


and so $\mathcal{O}$ would intersect $\mathcal{N}_x$ in more than one point, contradicting the slice theorem. Now take
$y = \gamma^{(x)}_X(s) \in U_x$ for some s ≠ 0. By the same calculation, ψ(y) = y, so $G_x \subset G_y$ and hence
$\psi_*(Y) = Y$ for each $Y \perp T_y\mathcal{O}_y$ (where this time, $\psi_* : T_yM \to T_yM$). Now decompose


$\dot\gamma^{(x)}_X(s) = Y_1 + Y_2$, (10.116)

with $Y_1 \perp T_y\mathcal{O}_y$ and $Y_2 \in T_y\mathcal{O}_y$. We know that $\psi_*(Y_1) = Y_1$, and $\psi_*(Y_2) \neq Y_2$ would lead to a
similar contradiction with the slice theorem as previously at x, so $\psi_*(Y_2) = Y_2$ and hence $Y_2 = 0$.


578 A k-dimensional foliation of an n-dimensional manifold M, where 0 < k < n, may be defined as a subbundle
E ⊂ TM of rank k that is involutive in the sense that if X,Y are sections of E, then so is their Lie bracket [X,Y]. In
that case, M is the disjoint union of the leaves of the foliation, which are (immersed) connected submanifolds $\mathcal{L}$
of M such that $T_x\mathcal{L}_x = E_x$ at each x ∈ M (where $\mathcal{L}_x$ is the leaf through x). The nontrivial fact we need is that, for
a general foliation, each x ∈ M has a nbhd U and associated chart (U, φ) with ensuing coordinates $(x^i)$ such that
$\mathcal{L} \cap U$ is given by $x^1$ = constant, ..., $x^{n-k}$ = constant. See e.g. Guillemin & Sternberg (1984), §27.


579 The general case is due to Schmidt (1967), Theorem 1. For the slice theorem used in the proof below see e.g.
Guillemin & Sternberg (1984), Proposition 27.2.
Thus $\gamma^{(x)}_X$ and hence $\mathcal{N}_x$ intersects all orbits in $U_x$ orthogonally,580 proving the second claim.

Call the map in the third claim $f : \mathcal{O} \to \mathcal{O}'$; since the orbits are compact, f can indeed
be defined on all of $\mathcal{O}$. This map is well defined by the previous claim and it is a local
diffeomorphism by the properties of the exponential map. For any ψ ∈ G one has $f \circ \psi = \psi \circ f$ by
computations on geodesics as in the previous step. Now choose an orthonormal basis $(e_1,\ldots,e_k)$,
with k = 2 in the case G = SO(3) and $G_x \cong SO(2)$ at hand, obtained from a unit vector $u \in T_x\mathcal{O}$
by $e_i = (\psi_i)_* u$ for suitable $\psi_i \in G_x$. Then all $e_i$ are unit vectors, and by G-equivariance (as just
noted) these are mapped into an orthogonal basis $u_i = (\psi_i)_*(f_* u)$, which also consists of vectors of
the same length. Linear conformal transformations are compositions of rotations, reflections,
and dilations,581 and hence f is conformal. Since G consists of isometries and acts transitively
on each orbit, a simple computation shows that the conformal factor is constant on $\mathcal{O}$.


By foliation theory, it is now possible to introduce coordinates $(t,r,\theta,\varphi)$ on $U_x$ such that:
• each orbit $\mathcal{O}$ is given by t = constant and r = constant;
• each normal surface $\mathcal{N}_x$ is given by θ = constant and φ = constant.


Here (θ,φ) are spherical coordinates on $S^2$. This yields (10.112), with which we close our discussion
of Birkhoff’s theorem; as already mentioned, we will not give the explicit computations
that lead from (10.112) to the Schwarzschild metric (but see §9.2, which assumed staticity).
Spherical symmetry is a very strong assumption, and so it is interesting to know that the 
Schwarzschild solution can also be inferred from a very different set of assumptions, which now 
includes staticity. The conclusion of the theorem is both global and restricted to the exterior 
region r > 2m, which requires the use of a manifold with boundary in the assumptions.582


Theorem 10.25 Let (M,g) be a one-ended static asymptotically flat space-time (cf. Definition
8.4) with metric (8.96), such that L > 0 in int(M) and L = 0 on ∂M, where $M = \mathbb{R} \times \Sigma$ and
$\partial M = \mathbb{R} \times \partial\Sigma$, with ∂Σ compact. If the metric solves the vacuum Einstein equations (8.100)
- (8.101) and (M, g) is maximally extended up to its boundary, then (M,g) is isometric to the
exterior region r > 2m of Schwarzschild space-time (9.46) with metric (9.15) having m > 0.

In particular, the boundary ∂M is connected and coincides with the future event horizon $H_E^+$.


Despite its strong assumptions, this theorem is quite remarkable, since it not only shows that 
the Schwarzschild metric is the only static vacuum black hole space-time (which by definition 
is asymptotically flat) with smooth event horizon, but it also shows that multiple black hole 
configurations (which would form a space-time with disconnected boundary and hence are 
excluded by the theorem) in vacuum cannot be static. In order to understand its assumptions, it 
is worth recalling that $L^2 = -g(\partial_t,\partial_t)$, where $\partial_t$ is the (usual) timelike Killing vector field of a
static space-time, so that the vanishing of L at ∂M makes the latter a Killing horizon, which a
posteriori is identified with the event horizon of a Schwarzschild black hole, cf. Theorem 9.1. 


580 Since we now know that $\dot\gamma^{(x)}_X(s) \perp T_y\mathcal{O}_y$ we may run the geodesic (and the argument) the other way round,
obtaining the inclusion $G_y \subset G_x$, and hence $G_y = G_x$. Since $\psi_*(X) = X$ for $\psi \in G_x$ and $X \perp T_x\mathcal{O}_x$, we also have
$\psi(\gamma^{(x)}_X(t)) = \gamma^{(x)}_X(t)$, so that $\mathcal{N}_x$ is pointwise invariant under $G_x$ (as the examples indeed illustrate).

581 This is Liouville’s theorem, see e.g. Akivis & Goldberg (1996), Theorem 1.1.1.

582 This is a version of Israel’s theorem from 1967, see footnote 575, due to Bunting & Masood-ul-Alam (1987). 
Israel assumed the boundary ∂M to be connected and also made several other superfluous assumptions. See also
Heusler (1996), §9.2, Schoen (2009), Lecture 11, and the references in footnote 575 for historical context. 
The proof (which we only sketch) aims at recovering the spatial Schwarzschild metric
$\tilde g_S$ from certain characteristic properties, upon which (8.96) gives the space-time metric on
$M_S = \mathbb{R} \times \Sigma_S$. The spatial metric $\tilde g_S$ is defined on $\Sigma_S := \mathbb{R}^3 \setminus ((0,2m] \times S^2)$ and is given by


$\tilde g_S = L(r)^{-2}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2); \quad L(r) = \sqrt{1 - 2m/r}$, (10.117)

Listed in the order of the chain of deduction, these characteristic properties of $\tilde g_S$ are as follows:


1. Beyond the generic properties $\tilde g_{ij} = \delta_{ij} + O(1/r)$ and L = 1 + O(1/r) from Definition
8.4, the specific asymptotics of $g = g_S$, with $g_{00} = -L^2$ and $\hat g_{ij} = L^2\tilde g_{ij}$, also satisfy


$L = 1 - \frac{m}{r} + O\left(\frac{1}{r^2}\right); \quad \hat g_{ij} = \delta_{ij} + O\left(\frac{1}{r^2}\right) \quad (r \to \infty)$, (10.118)

whereas Definition 8.4 only requires $\tilde g_{ij} = \delta_{ij} + O(1/r)$. In general, for our asymptotically
flat spatial metric $\tilde g$, the asymptotics expressed by eqs. (10.118) follow from the Einstein
equations (8.102), roughly speaking as follows. The second, $\tilde\Delta L = 0$, where $\tilde\Delta$ is the
Laplacian defined by $\tilde g$, has as lead term ΔL = 0, where $\Delta = \partial_x^2 + \partial_y^2 + \partial_z^2$ is the usual
flat Laplacian. In 3d flat space Laplace’s equation is solved by


$L = C - m/r$, (10.119)


where m and C are constants. Then L → 1 forces C = 1 and the error term in passing from
$\tilde\Delta$ to Δ gives the first entry in (10.118). Next, in terms of U = ln(L), eqs. (8.102) read


$\hat R_{ij} = 2\,\partial_i U\,\partial_j U; \quad \hat\Delta U = 0$, (10.120)


where $\hat R_{ij}$ and $\hat\Delta$ are the Ricci tensor and the Laplacian defined by $\hat g$, respectively. The
asymptotics of L give U = O(1/r) and hence $\hat R_{ij} = O(1/r^4)$, as r → ∞. In harmonic
coordinates (where $\hat\Delta x^i = 0$, i.e. $\hat\Gamma^i := \hat g^{jk}\hat\Gamma^i_{jk} = 0$), this yields the second part of (10.118).


2. The spatial metric $\tilde g_S$ is conformally flat.584 This follows either from a reparametrization


$r = \rho\,(1 + m/(2\rho))^2$: (10.121)
$\tilde g_S = (1 + m/(2\rho))^4 (d\rho^2 + \rho^2 d\Omega)$, (10.122)


or by computing the Cotton tensor (4.121) for the metric (10.117), which gives zero.


We now prove that our Riemannian manifold with boundary $(\Sigma, \tilde g)$, as defined through the
assumptions in Theorem 10.25 plus Definition 8.4, must be conformally flat. To this end,
we first perform a useful but partly unsuccessful manoeuvre. Rescale $\tilde g$ to


$\check g := \left(\frac{1+L}{2}\right)^4 \tilde g$. (10.123)


583 See Beig (1980), Kennefick & Ó Murchadha (1995), or Schoen (2009), Lecture 11. One needs two identities,
cf. Wiki’s List of formulas in Riemannian geometry. First, $\hat R_{ij} = \tilde R_{ij} - L^{-1}\tilde\nabla_i\tilde\nabla_j L + 2L^{-2}\tilde\nabla_i L\,\tilde\nabla_j L - L^{-1}\tilde\Delta L\,\tilde g_{ij}$,
valid in d = 3. Using (8.102) and L = exp(U), this gives $\hat R_{ij} = 2\,\partial_i U\,\partial_j U$. Second, $\tilde\Delta f = e^{2U}(\hat\Delta - \hat g^{ij}\partial_i U\,\partial_j) f$ for
any function f, also in d = 3, so taking f = L = exp(U) and using (8.102) gives $\hat\Delta U = 0$.

584 This was noted at least as early as Synge (1960), §VIII.4. We learnt it from Cederbaum (2019), Lecture 1.



The Riemannian manifold $(\Sigma, \check g)$ has vanishing Ricci scalar and asymptotic mass, i.e.


$\check R = 0; \quad \Pi^0(\check g) = 0$, (10.124)

where $\Pi^0$ is defined by (8.103). The first property follows from a simple computation.585
The second follows from the asymptotics (10.118), noting that $\Pi^0$ is determined by the
1/r term (cf. the computation of the Schwarzschild case in footnote 364). Since


$\left(\frac{1+L}{2}\right)^4 = 1 - \frac{2m}{r} + O\left(\frac{1}{r^2}\right); \quad \tilde g_{ij} = \left(1 + \frac{2m}{r}\right)\delta_{ij} + O\left(\frac{1}{r^2}\right)$, (10.125)


the m/r terms cancel in the product $\check g$. We would now like to invoke the second part of
the positive mass theorem 8.5 in order to infer that $(\Sigma, \check g)$ is isometric to Euclidean space
$(\mathbb{R}^3, \delta)$, so that $(\Sigma, \tilde g)$ is conformally flat. But this does not work since Σ is not a manifold
but a manifold with boundary, on top of which (and for that reason) it is not complete.


To remedy this, we perform a trick.586 First as a manifold, we form the “double”
$\Sigma_d = \Sigma \cup_{\partial\Sigma} \Sigma$, (10.126)


with metric $\tilde g_d$ given as the original one $\tilde g$ on both copies of Σ including their common
boundary. The function L, though, is extended to a function $L_d$ on $\Sigma_d$ defined as L on one
copy of Σ and as −L on the other; this can be done continuously (though not smoothly)
since L = 0 on ∂Σ (and also the metric $\tilde g_d$ is no longer smooth on the boundary).


Now rescale $\tilde g_d$ through (10.123), leading to a Riemannian manifold $(\Sigma_d, \check g_d)$ without
boundary. The end where L → 1 of course remains asymptotically flat, but because of the
conformal transformation (10.123) the end where L → −1 can be compactified by adding
a single point (which for the metric $\tilde g_d$ would have been a two-sphere at infinity).587 The
ensuing Riemannian manifold $(\Sigma_d, \check g_d)$ is complete (basically since one end is asymptotically
flat and the other end has been compactified). The computations that imply (10.124)
also work for $(\Sigma_d, \check g_d)$, which, then, satisfies the hypotheses of the second part of Theorem
8.5 and hence is isometric to $(\mathbb{R}^3, \delta)$. Consequently, $(\Sigma_d, \tilde g_d)$ is conformally flat, but since
this is a local property we conclude that $(\Sigma, \tilde g)$ is conformally flat.


3. The spatial metric $\tilde g_S$ is spherically symmetric. This is clear for $\tilde g_S$. For our general
metric $\tilde g$ (re)constructed so far, spherical symmetry follows from conformal flatness in the
situation where the Ricci tensor is given by (8.100). The proof uses the fact that dL ≠ 0
and that the level sets L = constant are (topologically) two-spheres, which in turn follows
from (10.118).588 Up to the boundary where L = 0, the space Σ of Definition 8.4.1


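585 For a conformal transformation $\check g = \Omega^2\tilde g$ in d = 3 we have $\check R = \Omega^{-2}\tilde R - 4\Omega^{-3}\tilde\Delta\Omega + 2\Omega^{-4}\tilde g^{ij}\partial_i\Omega\,\partial_j\Omega$. Taking
$\Omega = (1+L)^2/4$ and using (8.102) gives $\check R = 0$.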

586 For any manifold M with boundary ∂M, the manifold $M \cup_{\partial M} M$ is topologically defined as $(M \sqcup M)/\sim$, where
x ∼ y iff x, y ∈ ∂M and x = y, with a unique smooth structure making this space a manifold (without boundary).
More generally, given two manifolds with boundary $M_1$ and $M_2$ with a diffeomorphism $f : \partial M_1 \to \partial M_2$, one
may define $M_1 \cup_f M_2$ as $(M_1 \sqcup M_2)/\sim$, where x ∼ y iff $x \in \partial M_1$ and $y = f(x) \in \partial M_2$. The existence of a smooth
structure on this quotient derives from the collar neighbourhood theorem, which states that for any manifold M
with boundary ∂M there is a neighbourhood U of ∂M in M and an associated diffeomorphism $\psi : U \to \partial M \times [0,1)$.

587 This requires much more detailed arguments, see Lemma 3 in Bunting & Masood-ul-Alam (1987).

588 This requires some elliptic PDE theory and Morse theory. See Theorem 1 in Künzle (1971), which goes back to
Lichnerowicz (1955), §78: an asymptotically flat space is either flat (which corresponds to the case m = 0), or, if
m ≠ 0, has dL ≠ 0 throughout, with level sets ≅ $S^2$. The maximum principle for elliptic PDEs (or the last claim of
Theorem 8.5) gives the first claim, whereas the absence of points where dL = 0 makes all level sets homeomorphic
to those near r → ∞. From Definition 8.4 and the first entry in (10.118), these level sets are two-spheres.
is then foliated by the level sets of L, and one may set up a calculus à la (6.10) - (6.15),
but in one dimension lower. Indicating this by the use of spatial indices, and writing


$W := \nabla^i L\,\nabla_i L$, (10.127)

a computation shows that the Cotton tensor squared is given by

$C_{ijk}C^{ijk} = L^{-4}W^2\left(8\sigma_{ij}\sigma^{ij} + W^{-2}h^{ij}\,\partial_i W\,\partial_j W\right)$, (10.128)


where $h_{ij} = \tilde g_{ij} - n_i n_j$ is the projection onto the level sets, in terms of their normal n. Thus
conformal flatness, i.e. $C_{ijk} = 0$, enforces $\sigma_{ij} = 0$ and $h^{ij}\partial_j W = 0$. This makes the level
sets, already known to be two-spheres topologically, also two-spheres metrically (i.e. with
the usual SO(3)-invariant metric $d\Omega$), so that Σ is spherically symmetric.589
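
The reparametrization in step 2 can be checked numerically. The Python sketch below (all function names are hypothetical helpers of this illustration) compares the radial and angular coefficients of (10.117), rewritten in terms of ρ via (10.121), with the conformal factor of (10.122); the derivative dr/dρ is approximated by a central difference, so agreement holds only up to discretization error.

```python
import math


def areal_radius(rho, m):
    """Reparametrization (10.121): r = rho * (1 + m/(2 rho))^2."""
    return rho * (1.0 + m / (2.0 * rho)) ** 2


def lapse(r, m):
    """L(r) = sqrt(1 - 2m/r) from (10.117), defined for r > 2m."""
    return math.sqrt(1.0 - 2.0 * m / r)


def conformal_factor(rho, m):
    """The factor (1 + m/(2 rho))^4 appearing in (10.122)."""
    return (1.0 + m / (2.0 * rho)) ** 4


def metric_coefficients(rho, m, h=1e-6):
    """Return (radial, angular, expected) at isotropic radius rho > m/2.

    radial  : coefficient of d rho^2 obtained from L^{-2} dr^2
    angular : coefficient of rho^2 dOmega obtained from r^2 dOmega
    expected: the conformal factor of (10.122)
    """
    r = areal_radius(rho, m)
    # central-difference approximation of dr/d rho
    dr_drho = (areal_radius(rho + h, m) - areal_radius(rho - h, m)) / (2.0 * h)
    radial = dr_drho ** 2 / lapse(r, m) ** 2
    angular = r ** 2 / rho ** 2
    return radial, angular, conformal_factor(rho, m)
```

All three returned numbers should agree for any ρ > m/2, which is the statement that the spatial Schwarzschild metric is conformally flat in these coordinates.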


We have shown (at least in outline) that the Riemannian manifold with boundary $(\Sigma, \tilde g)$ defined
through the assumptions in Theorem 10.25 plus Definition 8.4 is spherically symmetric. Using 
Birkhoff’s theorem in the simple case where staticity has already been assumed, the spatial 
Schwarzschild metric (10.117) and then the full one (9.15) via (8.96) then follow. The case 
m < 0 is excluded by the assumptions in Theorem 10.25, since L has no zeros in that case. 
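
The rescaling trick of step 2 can be illustrated on the Schwarzschild example itself. In the isotropic radius ρ of (10.121) the lapse is L = (1 − m/(2ρ))/(1 + m/(2ρ)), so the factor of (10.123) multiplied by the factor of (10.122) equals 1 on the copy carrying +L, and (m/(2ρ))^4 on the copy carrying −L. The Python sketch below checks this, together with the elementary fact that the inversion ρ' = (m/2)²/ρ turns (m/(2ρ))^4 (dρ² + ρ² dΩ) into the flat metric in ρ', which sends ρ → ∞ to the single point ρ' = 0. The function names and the restriction to this one example are assumptions of the sketch, not part of the proof.

```python
import math


def lapse_isotropic(rho, m):
    """Schwarzschild lapse in isotropic radius rho > m/2."""
    x = m / (2.0 * rho)
    return (1.0 - x) / (1.0 + x)


def rescaled_factor(rho, m, sign=+1):
    """Conformal factor of the rescaled metric relative to the flat metric.

    sign=+1 uses L (first copy of the double), sign=-1 uses -L (second copy).
    """
    x = m / (2.0 * rho)
    omega4 = ((1.0 + sign * lapse_isotropic(rho, m)) / 2.0) ** 4  # cf. (10.123)
    return omega4 * (1.0 + x) ** 4                               # cf. (10.122)


def inverted_radius(rho, m):
    """Inversion rho' = (m/2)^2 / rho, mapping rho -> infinity to rho' -> 0."""
    return (m / 2.0) ** 2 / rho
```

On the first copy the rescaled Schwarzschild metric is thus exactly Euclidean, while on the second copy it is Euclidean after the inversion, consistent with the one-point compactification described in the text.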


Short of Birkhoff’s, this is the simplest uniqueness theorem for black holes! Similar reasoning
shows that the subcritical Reissner-Nordström metric (i.e. 0 < |e| < m) is the unique
(exterior) static “electrovac” black hole space-time with smooth non-degenerate event horizon
and vanishing magnetic charge (where ‘non-degenerate’ means nonzero surface gravity).590
However, the degenerate case (|e| = m > 0) is not unique! Consider the Reissner-Nordström 
metric (9.88) for this case, given by the usual static metric (8.96), now with spatial part 


$\tilde g_{RNd} = L(r)^{-2}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2); \quad L(r) = 1 - m/r$. (10.129)
Compared with the Schwarzschild metric (10.117), we have 1 − m/r instead of $\sqrt{1 - 2m/r}$. The
coordinate transformation ρ = r − m, followed by renaming ρ as r, turns the total space-time metric into


$g_{MP} = -U^{-2}dt^2 + U^2(dr^2 + r^2 d\Omega)$, (10.130)


where r > 0 and 
$U := 1 + m/r$. (10.131)


Clearly, U solves the (flat space) Laplace equation 
$\Delta U = 0$. (10.132)


The point is that (10.130) is a static solution to the Einstein—Maxwell equations for arbitrary 
(static) solutions to (10.132), provided the electromagnetic field potential is taken to be 


$A = U^{-1}dt$. (10.133)


As such, (10.130) is called a Majumdar-Papapetrou metric. For example, one may take 


$U = 1 + \sum_{i=1}^{N} \frac{e_i}{|x - y_i|}$, (10.134)


589 See Theorem 2 in Künzle (1971) or Corollary 9.3 in Heusler (1996).
590 See Israel (1968), Masood-ul-Alam (1992), and Heusler (1996), §9.3, and further references in the latter.
where $(y_1,\ldots,y_N) =: Y$ is any finite set of points in $\mathbb{R}^3$, so that $g_{MP}$ is defined on
$M_{MP} = \mathbb{R} \times (\mathbb{R}^3 \setminus \{y_1,\ldots,y_N\})$. (10.135)


It turns out that the event horizon $H_E^+$ equals Y, and that $e_i$ is the charge at the puncture $y_i$.591
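
That a multi-centre potential of the form (10.134) solves the flat Laplace equation away from the punctures can be checked numerically. The Python sketch below evaluates such a potential and a second-order finite-difference Laplacian; the function names, the generic `weights` (standing in for the coefficients of (10.134)) and the step size are assumptions of this illustration. It also checks the one-centre case against the extremal Reissner-Nordström lapse L(r) = 1 − m/r via ρ = r − m.

```python
import math


def mp_potential(x, punctures, weights):
    """U(x) = 1 + sum_i w_i / |x - y_i| for a 3-tuple x away from all punctures y_i."""
    total = 1.0
    for y, w in zip(punctures, weights):
        total += w / math.dist(x, y)
    return total


def flat_laplacian(f, x, h=1e-3):
    """Second-order central-difference approximation of the flat Laplacian of f at x."""
    acc = -6.0 * f(x)
    for axis in range(3):
        for step in (h, -h):
            shifted = list(x)
            shifted[axis] += step
            acc += f(tuple(shifted))
    return acc / (h * h)
```

Applied to `mp_potential`, the discrete Laplacian vanishes up to discretization and rounding error at any point away from the punctures, whereas for a non-harmonic function such as the square of U it does not.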
This leads to a generalization of Theorem 10.25, which like most uniqueness theorems 
describes the domain of outer communication (DOC). Apart from its axiomatic definition (10.83) 
in the presence of $\mathscr{I}$, this domain can also be defined for a stationary asymptotically flat
space-time (M, g), with asymptotically timelike Killing vector X (= $\partial_t$). Just calling it D, we put


$D := I^-(M^{ext}) \cap I^+(M^{ext}); \quad M^{ext} := \bigcup_{t\in\mathbb{R}} \varphi_t(\Sigma^{ext})$, (10.136)


where $\Sigma^{ext}$ is one of the ends of M as in Definition 8.4.1, and $\varphi_t$ is the flow of X (assumed
complete). For example, in Kruskal space-time with $\Sigma^{ext}$ “to the right”, D corresponds to region
I. Similarly, we may define black and white hole regions and their event horizons by 


$B^\pm := M \setminus I^\mp(M^{ext})$, (10.137)
$H_E^\pm := \partial B^\pm = \partial I^\mp(M^{ext})$. (10.138)


Recent uniqueness theorems assume that D is globally hyperbolic,592 which is a “Penrosian”
version of weak cosmic censorship, see the end of §10.4. In the static case, we then have:593


Theorem 10.26 Let (M,g) be a static asymptotically flat electrovac space-time (i.e. solving the
source-free Einstein-Maxwell equations) containing a connected acausal spacelike hypersurface
Σ (cf. Definition 8.4.3) whose closure $\bar\Sigma$ is a topological manifold with boundary consisting (as
a disjoint union) of a compact set K and finitely many ends $\Sigma^{ext}$ (cf. Definition 8.4.1). If the DOC
(D, g) is globally hyperbolic and $\partial\bar\Sigma \subset M \setminus D$, then Σ can have only one asymptotic end, and:


• If the event horizon $H_E^+$ defined in (10.138) is connected, then the DOC of the unique end
$\Sigma^{ext}$ is isometric to the DOC of a Reissner-Nordström space-time with $0 \le |e| \le m \neq 0$.


• If the event horizon $H_E^+$ is not connected, then the DOC of $\Sigma^{ext}$ is isometric to the DOC of a
Majumdar-Papapetrou space-time with N ≥ 2.


In particular, in the vacuum case one recovers the DOC of Schwarzschild space-time with m > 0.


Here Reissner-Nordström includes e = 0, i.e., Schwarzschild with m > 0. The cases |e| > m > 0
and e = 0 > m are excluded because they lack event horizons (and have D = M, which is not
globally hyperbolic, cf. §10.6). The Minkowski case e = m = 0 has no event horizon either.
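
The case distinction in the preceding paragraph can be summarized in a few lines of Python. The sketch assumes the standard Reissner-Nordström function h(r) = 1 − 2m/r + e²/r² in geometric units, so that horizons sit at the positive roots of r² − 2mr + e² = 0; this form, the function names and the labels are assumptions of the illustration rather than quotations of (9.88). Parameter values not discussed in the text (such as m ≤ 0 with e ≠ 0) are simply reported as having no positive root, and the exact equality tests are meant for exactly representable inputs only.

```python
import math


def rn_horizon_radii(m, e):
    """Positive roots of r^2 - 2 m r + e^2 = 0, in increasing order."""
    disc = m * m - e * e
    if disc < 0:
        return []
    root = math.sqrt(disc)
    radii = sorted({m - root, m + root})
    return [r for r in radii if r > 0]


def classify_rn(m, e):
    """Label the parameter ranges distinguished in the text."""
    if m == 0 and e == 0:
        return "Minkowski (no event horizon)"
    radii = rn_horizon_radii(m, e)
    if not radii:
        return "no event horizon"
    if e == 0:
        return "Schwarzschild (non-degenerate horizon)"
    if abs(e) == m:
        return "extremal Reissner-Nordstrom (degenerate horizon)"
    return "subextremal Reissner-Nordstrom (non-degenerate horizon)"
```

For instance, `classify_rn(1, 0)` falls in the Schwarzschild class, while `classify_rn(1, 2)` and `classify_rn(-1, 0)` have no event horizon.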


591 See Chruściel (2020), §4.7. The charge is defined by $-(4\pi)^{-1}\int_{S_i} {*F}$, where $S_i$ is some two-sphere around $y_i$.

592 Global hyperbolicity of D is clearly a necessary condition for the theorems (as the space-times in their
conclusions satisfy it). Galloway (1995) showed that global hyperbolicity of D plus the null energy condition (which
is satisfied here) imply that D is simply connected, which is needed for the proof of Theorem 10.26 (which we
omit). Global hyperbolicity of D is also needed to underpin the heavy PDE analysis in all proofs (Chrusciel & 
Lopes Costa, 2008). The original assumption used in this context by Hawking and others in the 1970s was strong 
asymptotic (future) predictability, see footnote 638. 

593 Theorem 10.26 is Theorem 3.1 in Chrusciel, Lopes Costa, & Heusler (2012), which as the authors explain is 
the culmination of a long development starting with Israel’s theorem 10.25 (and we would say: with Birkhoff’s). 
The implication that Σ can then only have one asymptotically flat end is part of Theorem 3.3.1 in Chruściel (2020),
originally due to Chrusciel & Wald (1994a). This follows from topological censorship, see footnote 601 in §10.10. 
10.10 Black hole uniqueness theorems: Stationary case


Passing from the static to the stationary case, a natural generalization of Theorem 10.26 would 
be that the domain of outer communication (D, g) is isometric to the DOC of a Kerr-Newman 
space-time (with $0 \le a^2 + e^2 < m^2$), at least if the event horizon is connected.594 Unfortunately,
this has only been proved under considerably stronger assumptions, namely:595


1. Those of Theorem 10.26, of course replacing static by stationary;596


2. Connectedness and non-degeneracy of the horizon $H_E^+$ (i.e. surface gravity κ ≠ 0).597


3. $\partial\bar\Sigma \subset \partial D \cap I^+(D)$, such that $\partial\bar\Sigma$ intersects each null generator of $\partial D \cap I^+(D)$ once.598


4. Analyticity of the space-time metric g.599


Theorem 10.27 Let (M,g) be a stationary asymptotically flat electrovac space-time satisfying
1-4. Then (D,g) is isometric to the DOC of a Kerr-Newman space-time with $0 \le a^2 + e^2 < m^2$.


Thus (M, g) is characterized by just three numbers (m,a,e) and hence this is the ultimate “no-hair 
theorem”. An important stepping stone from the static to the stationary case is Hawking’s rigidity 
theorem, which is very interesting by itself and explains the coincidence of event horizons and 
Killing horizons in the Schwarzschild, Reissner-Nordström, and Kerr metrics.600


Theorem 10.28 Under the assumptions of Theorem 10.27, either the asymptotically timelike 
Killing vector field X defining stationarity is tangent to the event horizon $H_E^+$, or the isometry
group of (M,g) contains R x U(1), where U(1) acts via spatial rotations, and there is another
vector field Y that is a linear combination of X and the generator $\partial_\varphi$ of the U(1) isometries, for
which $H_E^+$ is a Killing horizon. Either way, the event horizon $H_E^+$ is also a Killing horizon.


Although the first option describes the situation for the Schwarzschild metric, see Theorem 9.1, 
and the second the one for the Kerr metric, see Theorem 9.3 and eq. (9.143), in general it seems 
quite mysterious where the axial symmetry should come from. This much we will explain. A 
key lemma for Hawking’s rigidity theorem and a very important result in its own right is:601


594 There are various candidates for space-times with multiple rotating black holes generalizing the Majumdar-
Papapetrou metrics to the stationary case, none of which are well understood. See e.g. Weinstein (1996) as well as 
numerous physics papers, partly reviewed in Chrusciel, Lopes Costa, & Heusler (2012), §3.2.2. 

595 Theorem 10.27 is due to Chruściel & Lopes Costa (2008); see also Lopes Costa (2010). There are many other 
stationary axisymmetric solutions (Stephani et al., 2003, chapters 19-21) but these are either not asymptotically flat
or have non-globally hyperbolic DOC and hence lack an event horizon and fail weak cosmic censorship. 

596 Completeness of the Killing field X in charge of stationarity is included as part of the definition of the latter.

597 One may drop non-degeneracy at the expense of assuming U(1)-invariance (Chrusciel & Nguyen, 2010).

598 By (10.136) and Proposition 10.16.1, $\partial D \cap I^+(D)$ is ruled by lightlike geodesics. This assumption is technical
and is explained in detail in Lopes Costa (2010). Roughly speaking, the specific cross-section of the horizon given
by intersection with $\partial\bar\Sigma$ must hit its null generators once. This is clearly the case for Kerr-Newman.

599 This is the most undesirable hypothesis, required for Theorem 10.28. See also footnote 604. 

600 The original version is in Hawking (1972) and Hawking & Ellis (1973), Proposition 9.3.6. Theorem 10.28 is 
like Theorem 5.1 in Chrusciel (1996), based on Chrusciel (1997). See also Friedrich, Rácz, & Wald (1999).

601 This proposition goes back to Hawking (1972) and Hawking & Ellis (1973), §9.3, with dubious proof. The 
approach via topological censorship, introduced by Friedman, Schleich, & Witt (1993), is due to Chrusciel & 
Wald (1994a), Galloway (1995), Browdy & Galloway (1995), and Jacobson & Venkataramani (1995). For the 
topological singularity theorem of Gannon (1995) and Lee (1976) see footnote 311. Overall, see §3.3 in Chrusciel 




Proposition 10.29 Let (M,g) be a stationary asymptotically flat space-time satisfying the null 
energy condition, as well as weak cosmic censorship in the sense that its DOC D is globally 
hyperbolic. Accordingly, let $\Sigma$ be a Cauchy surface in $D$, with $\Sigma^{\mathrm{ext}} \cong (0,\infty) \times S^2$. Then:


1. The domain of outer communication D is simply connected. 


2. If the closure $\bar\Sigma$ of $\Sigma$ in $M$ intersects the future event horizon $H_E^+$ in a compact set $K$, and
$H_E^+$ is connected, then $K \cong S^2$ and hence $H_E^+ \cong \mathbb{R} \times S^2$ (both meant topologically).


Both results are very deep and as usual by now, we can only sketch the arguments. The inference 
from global hyperbolicity to simple connectedness is essentially the principle of topological 
censorship, which is proved (by contradiction) by combining ideas from Corollary 10.18 and 
the topological singularity theorem, which all go back to Penrose’s singularity theorem. The 
second claim follows from the first, combined with a result from differential topology:⁶⁰²


Lemma 10.30 If $N$ is a compact simply connected 3-manifold with non-empty boundary $\partial N$,
then all connected components of $\partial N$ must be diffeomorphic to two-spheres $S^2$.


The simplest example is the three-ball $B^3$ with $\partial B^3 \cong S^2$. In the case at hand, Theorem 5.33
gives $D \cong \mathbb{R} \times \Sigma$ and hence $\Sigma$ is simply connected by part 1 of Proposition 10.29. Since $(M,g)$
is asymptotically flat we can cut off $\bar\Sigma$ at some large radius $r$, giving the $N$ of the lemma. The
component of $\partial N$ at the asymptotically flat end is $\cong S^2$ (this much is clear even without the
lemma, which of course confirms it), and the other is $\bar\Sigma \cap H_E^+ \cong S^2$ (from Lemma 10.30).


Towards Theorem 10.28, by stationarity of the metric, the event horizon (which is defined by the
causal structure and hence by the metric) is invariant under the flow of $X$ (which by definition
of a Killing vector field consists of isometries), and hence $X$ is tangent to $H_E^+$. Thus $X = L - Z$
on $H_E^+$, where $L$ is tangent to the null generators of $H_E^+$, and $Z$ is tangent to the spacelike
two-spheres $S^2$ of Proposition 10.29.2. Let $\tilde g = \sum_{i,j} \tilde g_{ij}\, dx^i dx^j$ be the Riemannian metric on
$S^2$ (so far meant topologically, rather than metrically), and write $L = d/ds$ and $Z = \sum_i Z^i \partial_i$.
Using $g_{is} = g_{si} = 0$ (expressing orthogonality of $L$ to the null hypersurface $H_E^+$, to which $L$ is
simultaneously tangent!), the Killing equation $\mathcal{L}_X g_{ij} = 0$, given by (2.94), comes down to


$\mathcal{L}_Z \tilde g - \partial_s \tilde g_{ij}\, dx^i dx^j = 0; \quad \partial_s Z^i = 0.$ (10.139)


Now the conceptual key to Hawking’s argument is that, (possibly) apart from the isometries
generated by the “stationarity” Killing vector field $X$, at least the intrinsic geometry of the
horizon of a stationary black hole, as determined by $\tilde g$, is also invariant under the flow of $L$, i.e.
along its null generators. This will follow from the arguments leading to (10.170) below, which
give $k_{\mu\nu} = 0$ on the horizon. Eq. (8.23) then gives $\partial_s \tilde g_{ij} = 0$, so that (10.139) becomes


$\mathcal{L}_Z \tilde g = 0; \quad \partial_s Z^i = 0.$ (10.140)


(2020). As noted at the end of §7.3, the null energy condition required in Proposition 10.29 is satisfied by electrovac 
space-times, so we need not assume it separately in Theorem 10.27. Finally, Proposition 10.29 speaks of “the” DOC, 
since by footnote 593 there can only be one asymptotically flat end in M. See also footnote 593. 

602 This is Lemma 4.9 in Hempel (1976). A simpler proof, kindly provided by my colleague Ioan Mărcuț, uses a
long exact sequence in de Rham cohomology, viz. $0 \to H^0(N) \to H^0(\partial N) \to H^2(N) \to H^1(N) \to H^1(\partial N) \to H^1(N) \to \cdots$,
valid in $d = 3$. By assumption, $H^1(N) = 0$, which gives $0 \to \mathbb{R} \to H^0(\partial N) \to H^2(N) \to 0$ as well as $H^1(\partial N) = 0$.
The former gives $\dim(H^2(N)) + 1$ components of $\partial N$ and the latter makes each of these diffeomorphic to $S^2$.
Hence the vector field $Z$, so far defined only on $H_E^+$, is independent of $s$ and generates (the
same) isometries on each spacelike two-sphere within $H_E^+$ that is orthogonal to $L$, i.e. to the null
generators of the horizon. Now there are clearly two mutually exclusive possibilities:


• Either $Z = 0$, in which case $X = L$ is tangent to the horizon, which thereby becomes a
Killing horizon with respect to $X$. By highly nontrivial further arguments going under the
name staticity theorem,⁶⁰³ the stationary case is eventually reduced to the static case.


• Or $Z \neq 0$, in which case $X + Z = L$ is tangent to the horizon, making it a Killing horizon
for a new Killing vector field. By Lemma 10.31 below, $Z$ has periodic orbits on the
horizon, and by (10.140) the vector fields $Z$ and $X$ commute. This eventually leads to
the factorization $\mathbb{R} \times U(1)$ of (asymptotic) time-evolution and rotation. As Hawking
suggested, the extension of $Z$ and hence of the $U(1)$ symmetry it generates off the horizon
to all of $M$ can be done via analyticity of the metric, which is why this was assumed.


Lemma 10.31 If a (Riemannian) metric $\tilde g$ of a two-sphere $S^2$ (seen as a manifold only) admits a
nonzero Killing vector field $Z$, then the orbits of the flow of $Z$ are closed (i.e. periodic).


Proof.⁶⁰⁶ First, like any vector field on a compact manifold, $Z$ is complete and hence has a
globally defined flow $\psi : \mathbb{R} \times S^2 \to S^2$; as usual we write $\psi_t(x) = \psi(t,x)$, with $\psi_t : S^2 \to S^2$.
By the “hairy ball” theorem,⁶⁰⁷ $Z$ has a zero on $S^2$, say at $z$ (for $Z = \partial_\varphi$ on $S^2$ with the usual
metric, $z$ would be the north pole or the south pole). The tangent map $T_z\psi_t$ ($= (\psi_t)_*$ at $z$) then
maps $T_zS^2$ to itself (for each $t \in \mathbb{R}$), and since each $\psi_t$ is an isometry, $T_z\psi_t$ is an isometry of
$\tilde g_z$. Identifying $T_zS^2 \cong \mathbb{R}^2$ through the choice of an orthonormal basis, we have $T_z\psi_t \in SO(2)$,
and more precisely, $T_z\psi_t = \exp(tA)$ for some $A$ in the Lie algebra of $SO(2)$. Consequently,
$T_z\psi_T = \mathrm{id}$ for some (smallest) $0 < T < \infty$; we may normalize $Z$ such that $T = 2\pi$. Furthermore,


$\psi_t(\exp_z(V)) = \exp_z(T_z\psi_t(V)).$ (10.141)


By Hopf-Rinow, the map $\exp_z : T_zS^2 \to S^2$ is surjective, and hence $\psi_T(x) = x$ for any $x \in S^2$.


603 Such a theorem shows that under the assumptions of Theorem 10.27, where in addition (M, g) is not rotating, 
X is hypersurface orthogonal (Sudarsky & Wald, 2002, 2003; Chruściel & Wald, 1994b; Heusler, 1996, §8.2). 

604 See Hawking & Ellis (1973), Proposition 9.3.6, Chruściel (1996), Lemma 5.2 and Heusler (1996), §8.1.
Stationary vacuum solutions can only be proved to be analytic (i.e. $g_{\mu\nu}$ has a convergent power series expansion)
where X is timelike (Müller zum Hagen, 1970ab), i.e. in the DOC. Attempts to remove analyticity of the metric 
from the assumptions of Theorems 10.27 and 10.28 have led to a program (still in progress) called Kerr rigidity. 
This aims at a different version of the black hole uniqueness theorems, where in compensation for weakening the 
assumptions along the above lines, one also has to strengthen them, in that (so far in the vacuum setting) one tries to 
show that at least stationary solutions to Einstein’s equations that are close to Kerr, in the DOC actually coincide 
with Kerr. See Alexakis, Ionescu, & Klainerman (2014) and Ionescu & Klainerman (2015). Kerr rigidity is to be 
distinguished from Kerr stability, which is the conjecture that generic perturbations of the initial data for the Kerr 
metric lead to a MGHD that is close to the original one (at least in the DOC). This would generalize the paradigmatic 
theorem on the stability of Minkowski space-time (Christodoulou & Klainerman, 1993; Lindblad & Rodnianski, 
2010) to black holes. There is numerical evidence for this (Zilhão et al., 2014), and recent mathematical results 
prove it for a = 0, i.e. Schwarzschild (Dafermos, Holzegel, Rodnianski, & Taylor, 2021), and for small a, both for 
cosmological constant $\Lambda = 0$ (Klainerman & Szeftel, 2021) and $\Lambda > 0$ (Hintz & Vasy, 2018).

605 There is a smallest period $T_0$ of which all other periods are integral multiples. First, by the period bounding
lemma (stating that the non-zero periods of a vector field on a compact manifold are bounded from below) there
are only finitely many periods. Second, by the proof of Lemma 10.31, each period equals $T/n_i$, for some $n_i \in \mathbb{N}$.
Taking $n_0 = \mathrm{LCM}(n_1, \ldots, n_k)$, where $T/n_i$ are the periods, it follows that each period is a multiple of $T/n_0$.
606 This proof was again kindly provided by my colleague Ioan Mărcuț. The lemma fails if $S^2$ is replaced by for
example the 2-torus (Kronecker foliation) or the three-sphere (Reeb foliation), cf. Moerdijk & Mrčun (2003), §1.1.
607 This theorem is often attributed to Brouwer (who generalized it to $S^{2n}$), but it goes back to Poincaré (1885).
At this stage we know that a space-time satisfying the assumptions of Theorem 10.27 is both
stationary and axisymmetric, in that its isometry group contains $\mathbb{R} \times U(1)$, where $\mathbb{R}$ at least in
the DOC gives timelike transformations, whereas $U(1)$ gives spatial rotations around a symmetry
axis (which consists of all points where the Killing vector field $Z$ generating these rotations
vanishes).⁶⁰⁸ The next step is the circularity theorem to the effect that the distribution orthogonal
to $X$ and $Z$ is integrable,⁶⁰⁹ so that, roughly speaking, in suitable coordinates the 2-surfaces
generated by the $\mathbb{R} \times U(1)$ action (which in the DOC have constant $r$ and $\theta$) are orthogonal to
2-surfaces having constant $t$ and $\varphi$. The assumptions in this theorem are automatically satisfied
when the Ricci tensor vanishes, so for simplicity we restrict ourselves to the vacuum case.⁶¹⁰
The circularity theorem brings the metric into the so-called Papapetrou form,

$g = -\rho^2 e^{2\lambda}\, dt^2 + e^{-2\lambda}(d\varphi + A\, dt)^2 + e^{2\lambda} e^{2h}(d\rho^2 + dz^2),$ (10.142)


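in coordinates $(t, \rho, z, \varphi)$ resembling the usual cylindrical coordinates ($x = \rho\cos\varphi$, $y = \rho\sin\varphi$, $z$),
where the functions $\lambda$, $A$, and $h$ only depend on $(\rho, z)$. In terms of the 2d Riemannian manifold
$(\Sigma, \tilde g)$ defined by $\Sigma = \mathbb{R} \times (0,\infty)$, coordinatized by $\rho > 0$ and $z \in \mathbb{R}$, and $\tilde g = e^{2h}(d\rho^2 + dz^2)$,
solving the vacuum Einstein equations $R_{\mu\nu} = 0$ then comes down to solving the elliptic PDE


$X \nabla_i(\rho \nabla^i E) + \rho\, \nabla_i E\, \nabla^i E = 0,$ (10.143)


called the (vacuum) Ernst equation, for the complex Ernst potential $E = -X + iY$, with $X > 0$.
Here $i = 1,2$, and $\nabla$ is the covariant derivative defined by the 2d metric $\tilde g$. Namely, if we know
$E$ we find $\lambda$ from $X = \exp(-2\lambda)$, whereas $A$ and $h$ come from solving the first-order PDEs


$\partial_\rho A = \frac{\rho}{X^2}\, \partial_z Y; \quad \partial_\rho h = \frac{\rho}{4X^2}\left(\partial_\rho E\, \partial_\rho \bar E - \partial_z E\, \partial_z \bar E\right);$ (10.144)
$\partial_z A = -\frac{\rho}{X^2}\, \partial_\rho Y; \quad \partial_z h = \frac{\rho}{4X^2}\left(\partial_\rho E\, \partial_z \bar E + \partial_z E\, \partial_\rho \bar E\right).$ (10.145)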


Eq. (10.143) is subject to boundary conditions dictated by the assumptions in Theorem 10.27,⁶¹¹
and the last difficult part of the proof of Theorem 10.27 is to show that these conditions precisely
allow the Kerr metric (or, in the electrovac case, where (10.143) has extra terms, the Kerr-Newman
metric) and no other solutions. This has been done in at least four different ways, none
of which is easily explained.⁶¹² The general point, though, is that through stationarity and its
consequence axisymmetry, the vacuum Einstein equations have been reduced to a 2d elliptic
boundary value problem, which can be completely controlled and gives the desired uniqueness.

These results are very impressive and should suffice for stationary (i.e. long-term) astrophysical
situations. However, from a theoretical point of view it should be stressed that couplings to
other forms of matter than electromagnetism typically do give “hair” to black holes.⁶¹³


608 This set is non-empty by Lemma 10.30 and Poincaré’s hairy ball theorem used in the proof of Lemma 10.31. 
609 Compare with Lemma 10.24.2, which in the spherically symmetric case gives 2-surfaces with constant $r$ and $t$
that are orthogonal to 2-surfaces with constant $\theta$ and $\varphi$, viz. the leaves of the $S^2$-foliation of Lemma 10.24.1.
610 For all that follows, see Carter (1979, 1986), Heusler (1996), and Chruściel, Lopes Costa, & Heusler (2012).
611 In order to resolve the ambiguity that $\rho = 0$ both at the horizon and at the zeros of the rotation generator $Z$,
these boundary conditions are usually stated in terms of coordinates $(x,y)$ instead of $(\rho,z)$, constrained so as to lie
in the semi-strip $x > C := m - 2\Omega J$ and $-1 < y < 1$, and defined by $\rho = \sqrt{(x^2 - C^2)(1 - y^2)}$ and $z = xy$. In terms
of these, the 2d metric becomes $d\rho^2 + dz^2 = (x^2 - C^2 y^2)\left(dx^2/(x^2 - C^2) + dy^2/(1 - y^2)\right)$, and $x \to C$ is the horizon,
whereas $y \to \pm 1$ is the symmetry axis. See Carter (1986), p. 106, or Heusler (1996), p. 55. For example, asymptotic
flatness gives boundary conditions as $x \to \infty$, namely $x^{-2}X = (1 - y^2)(1 + O(1/x))$ and $Y = 2Jx(3 - y^2) + O(1/x)$,
where J is a constant. As y— +1, we have X, 0,Y, and 0,Y all O(1—y?), and d,X = c+O(y? — 1) for some c > 0. 
612 These are reviewed, with references to the original literature, in Chruściel, Lopes Costa, & Heusler (2012).
613 See e.g. Volkov (2018) and references therein, as well as, again, Chruściel, Lopes Costa, & Heusler (2012).
10.11 The Penrose inequality


The final state conjecture mentioned in §10.9 has an interesting and testable consequence known
as the Penrose inequality.⁶¹⁴ Because of its paramount role in the final state conjecture, the
Penrose inequality is often seen as a test of weak cosmic censorship. The logic seems to be:


final state conjecture $\Rightarrow$ weak cosmic censorship $\Rightarrow$ Penrose inequality,

where at least Penrose himself seemed clearly interested in the contrapositive implication

violation of Penrose inequality $\Rightarrow$ violation of weak cosmic censorship.


To motivate the inequality, let us first define the area $A_K$ of a Kerr black hole in the regime
$0 < |a| < m$, where weak cosmic censorship holds (see Theorem 9.3 in §9.7 and §10.6), as the
area of its (future) event horizon $H_E^+$ at some fixed value $v_0$ of the lightlike coordinate $v = v_+$
defined above (9.136). Since $r = r_+$ at this horizon, one may also say that $t$ is fixed, and hence
that $A_K$ is the area of the intersection of $H_E^+$ with some spacelike “wannabe” (i.e. partial) Cauchy
surface $\Sigma$. Since the metric is stationary, this area is in fact independent of $v$ or $t$ or $\Sigma$. We obtain


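$A_K = \int_0^{\pi} d\theta \int_0^{2\pi} d\varphi\, \sqrt{\det h(v_0, r_+, \theta, \varphi)} = \int_0^{\pi} d\theta \int_0^{2\pi} d\varphi\, \sqrt{\Sigma(r_+)}\, \sin\theta$
$\phantom{A_K} = \int_0^{\pi} d\theta \int_0^{2\pi} d\varphi\, (r_+^2 + a^2)\sin\theta = 4\pi(r_+^2 + a^2) = 8\pi\left(m^2 + m\sqrt{m^2 - a^2}\right),$ (10.146)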


where $h$ is the metric on the set $H_E^+ \cap \{v = v_0\}$ induced by the Kerr metric (9.137). Here we
used (9.113), of which only the first term remains, since $\Delta = 0$ at $r = r_+$, cf. (9.117). For the
Schwarzschild black hole, in which $a = 0$, with $r_S = 2m$ this simply gives


$A_S = 4\pi r_S^2 = 16\pi m^2.$ (10.147)

For the Kerr metric, still assuming $0 < |a| < m$ and hence weak cosmic censorship, we have

$A_K < 16\pi m^2.$ (10.148)


This fact about Kerr space-time is the key to the Penrose inequality. It gives a positive lower 
bound on the (asymptotic) mass of a black hole in terms of the area of some spatial cross-section 
of its event horizon, and the inequality is saturated by the Schwarzschild metric. 
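
As a quick numerical illustration (a sketch only, in units $G = c = 1$; the function names are hypothetical and not from the text), one may check that the two expressions for the Kerr horizon area in (10.146) agree, that $a = 0$ reproduces the Schwarzschild value (10.147), and that the bound (10.148) holds for sample values of $a$:

```python
import math


def kerr_horizon_area(m: float, a: float) -> float:
    """Horizon area 4*pi*(r_+^2 + a^2), with r_+ = m + sqrt(m^2 - a^2), in units G = c = 1."""
    if abs(a) > m:
        raise ValueError("no event horizon for |a| > m")
    r_plus = m + math.sqrt(m * m - a * a)
    return 4.0 * math.pi * (r_plus * r_plus + a * a)


def kerr_horizon_area_closed_form(m: float, a: float) -> float:
    """Equivalent closed form 8*pi*(m^2 + m*sqrt(m^2 - a^2))."""
    return 8.0 * math.pi * (m * m + m * math.sqrt(m * m - a * a))


def schwarzschild_area(m: float) -> float:
    """Schwarzschild horizon area 16*pi*m^2 (the case a = 0)."""
    return 16.0 * math.pi * m * m


# Compare the Kerr area with the Schwarzschild bound for a few spin values.
m = 1.0
for a in (0.0, 0.3, 0.6, 0.9, 1.0):
    ratio = kerr_horizon_area(m, a) / schwarzschild_area(m)
    print(f"a = {a:4.2f}:  A_K / (16 pi m^2) = {ratio:.6f}")
```

The printed ratio equals one at $a = 0$ and decreases as $|a|$ grows, which is the content of (10.148).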

Suppose the Kerr black hole is the final state of a gravitational collapse process. At some
earlier time $t$, captured by a spacelike hypersurface $\Sigma_t = \Sigma$, assuming weak cosmic censorship,
an event horizon has formed with spatial or cross-sectional area $A_t = A_\Sigma$ at time $t$, that is,


$A_\Sigma := \mathrm{Area}(H_E^+ \cap \Sigma).$ (10.149)


Here $H_E^+$ is the event horizon of the space-time of the collapsing matter as defined in (10.79).
By Hawking’s area law (10.160), to be discussed in the next section, the area can only increase
during the collapse process, so that $A_\Sigma \le A_K$, since $A_K = A_\infty$ is now seen as the horizon area at
$t = \infty$, where the collapse has been completed and a stationary Kerr black hole has been formed.


614 The original source is Penrose (1973), to be recommended also for its figures. Penrose describes weak cosmic 
censorship and the final state conjecture, both of which rank among his most visionary contributions to GR, as ‘the 
establishment viewpoint’, and sees his inequality as ‘an attempt to derive a contradiction with this viewpoint.’ The 
subject can be traced through the reviews by Bray (2002), Bray & Chrusciel (2004), Mars (2009), and Lee (2019). 
On the other hand, the mass $m_t = m_\Sigma$ of the black hole at time $t$ (the “now” of $\Sigma$) can only
decrease through gravitational radiation, i.e. $m_t \ge m = m_\infty$. Thus the area law together with
(10.148) gives $A_t \le A_K \le 16\pi m^2 \le 16\pi m_t^2$, or


$A_\Sigma \le 16\pi m_\Sigma^2,$ (10.150)


where, in the presence of possible asymptotic momentum, the asymptotic mass is defined by


$m_\Sigma := \sqrt{(\Pi^0)^2 - \|\vec\Pi\|^2}.$ (10.151)


Here $\Pi^0$ and $\vec\Pi$ are defined by (8.103) and (8.109), respectively, and the spacelike hypersurface
$\Sigma$ is supposed to carry initial data $(\tilde g, k)$ satisfying the inequality (8.110). Since this implies


$\Pi^0 \ge \|\vec\Pi\|$ (10.152)


by the (generalized) positive mass theorem, the mass $m_\Sigma$ in (10.151) is well defined, and
(10.151) simply reflects the basic formula $p^0 = \sqrt{\|\vec p\|^2 + m^2}$ from relativistic mechanics. Thus
the assumption (8.110) on the initial data will be made throughout this section.⁶¹⁵ This is not as
threatening as it sounds, since in the main case of interest, i.e. the static case, one has $k = 0$ and
hence (8.110) just comes down to non-negative scalar curvature, i.e. $R \ge 0$. See below.

Eq. (10.150), then, is a first version of the Penrose inequality. However, since the event
horizon $H_E^+$ has the disadvantage explained at the end of §10.3, which makes $A_\Sigma$ effectively
uncomputable from the initial data on $\Sigma$, the meaning of the inequality must be modified. The
idea is to replace $A_\Sigma$ by some computable number $A'_\Sigma$ resembling the area of a spatial cross-section
of the event horizon, in such a way that $A'_\Sigma \le A_\Sigma$. The redefined Penrose inequality
would then become $A'_\Sigma \le 16\pi m_\Sigma^2$, and although this is weaker than (10.150) and hence its proof
would give less information than a proof of (10.150), a violation of the weaker version, i.e.
$A'_\Sigma > 16\pi m_\Sigma^2$, would still falsify cosmic censorship, as Penrose intended in Popperian spirit.

A natural candidate to replace the “absolute” event horizon (10.79) is the apparent horizon.
Its definition relies on the notion of an outer trapped surface, which is predicated on the possibility
of defining an outer direction among the pair of lightlike vectors $(L, \underline{L})$ emanating from a closed
spacelike surface $S$, as defined in §6.3. This can be done, for example, if the given space-time
$(M,g)$ has a non-compact spacelike (full or wannabe) Cauchy surface $\Sigma$ and $S$ is such that $S \subset \Sigma$
and $\Sigma \setminus S = U \cup V$ with $U$ compact and $V$ non-compact, so that $S$ separates $\Sigma$ into an inside part
$U$ and an outside part $V$. In that case, the outer lightlike vector field $L$ is selected by $g(L,n) > 0$,
where $n$ is the outward normal to $S$ within $\Sigma$ (i.e. $n$ points towards $V$). This applies if $(M,g)$ is
asymptotically flat (cf. Definition 8.4) and $S$ is the boundary of a region that does not extend to
the asymptotic end $\Sigma^{\mathrm{ext}}$ of $\Sigma$ (we assume there is only one such end). We only consider closed
spacelike surfaces $S$ with this property, which are simply called surfaces in what follows.

In the presence of a preferred outer direction we write $(L^+, L^-)$ for $(L, \underline{L})$, normalized to
$g(L^+, L^-) = -2$ as in (6.58), and similarly write $(\theta^+, \theta^-)$ for $(\theta, \underline{\theta})$. If $S \subset \Sigma \subset M$, with $\Sigma$
a spacelike hypersurface, as above, as usual we write $N$ for the future-directed (fd) normal to the embedding
$\Sigma \hookrightarrow M$, normalized by $g(N,N) = -1$, and denote the corresponding extrinsic curvature by $k$.
Furthermore, let $n$ be the outward directed normal to the purely Riemannian embedding $S \hookrightarrow \Sigma$,
with extrinsic curvature $\hat k$. Generalizing (6.59) - (6.60), one has


$L^\pm = N \pm n; \quad \theta^\pm = \mathrm{Tr}_S(k) \pm \mathrm{Tr}_S(\hat k),$ (10.153)


615 Eq. (8.110) follows if the constraints (8.65) - (8.67) as well as the dominant energy condition $E \ge \|P\|$ hold.
But since the matter content is not specified, the constraints are usually not imposed in the Penrose inequality. 
where $\mathrm{Tr}_S(k)$ is the trace of the pull-back of $k$ to $S$ under the embedding $S \hookrightarrow \Sigma$, and $\mathrm{Tr}_S(\hat k)$
might as well have been written $\mathrm{Tr}(\hat k)$ since $\hat k$ was already defined on $S$ in the first place.⁶¹⁶
The following definition may look unnecessarily complicated, but that’s the way it is:⁶¹⁷


Definition 10.32 In the above circumstances, a surface $S \subset \Sigma \subset M$ is:

• future outer trapped if $\theta^+ < 0$, cf. (6.74) and (6.96);

• weakly outer trapped if $\theta^+ \le 0$;

• marginally outer trapped if $\theta^+ = 0$, in which case we call $S$ an MOTS.


1. The outer trapped region $T \subset \Sigma$ is the union of the interiors
of all weakly outer trapped surfaces in $\Sigma$.


2. The apparent horizon of $M$ within $\Sigma$ is $\mathcal{A}^+_\Sigma := \partial T$.


In the asymptotically flat case, it can be shown that the apparent horizon is smooth and is an
MOTS,⁶¹⁸ which by definition encloses all weakly outer trapped surfaces in $\Sigma$. For stationary
black holes the apparent horizon $\mathcal{A}^+_\Sigma$ coincides with $H_E^+ \cap \Sigma$, which therefore is an MOTS.

For example, for the Schwarzschild metric the property $\theta^+ = 0$ easily follows from eqs.
(6.96), (9.49), and (9.44). In fact, we also find $\theta^- = 0$, either by computation, or because


$\theta^- = -\theta^+$ (10.154)


in static space-times. This follows from (10.153), since now $k = 0$ and hence $\theta^\pm = \pm\mathrm{Tr}_S(\hat k)$.


Proposition 10.33 An MOTS $S \subset \Sigma \subset M$ in a static space-time $(M,g)$ has lightlike expansions

$\theta^+ = \theta^- = 0.$ (10.155)


In particular,⁶¹⁹ $S$ is a minimal surface in the 3d Riemannian manifold $\Sigma$.


Proof. This follows immediately from the definitions and from eqs. (10.154) and (6.85).


This proposition suggests that in general space-times MOTSs are Lorentzian analogues of 
minimal (hyper)surfaces in Riemannian geometry, which partly explains their enormous interest. 
More importantly, Proposition 10.33 is the key to the Riemannian Penrose inequality below. 

We return to the general (i.e. not necessarily static) case. A slight variation of Corollary 10.18
shows that the apparent horizon lies within the event horizon.⁶²⁰ But this does not mean that
$\mathrm{Area}(\mathcal{A}^+_\Sigma) \le \mathrm{Area}(H_E^+ \cap \Sigma)$, and hence, taking the left-hand side to be $A'_\Sigma$ and the right-hand
side as $A_\Sigma$, the desired inequality $A'_\Sigma \le A_\Sigma$ may fail. This may be remedied as follows.⁶²¹


616 See e.g. Minguzzi (2019), §6.4, for a derivation of (10.153) and similar results.

617 Hawking & Ellis (1973), §9.2, pp. 319-320, define the apparent horizon as the boundary of the outer trapped
region, which they define as the set of all points $x \in \Sigma$ that lie on some outer trapped surface. However, this faces
problems with the smoothness of the boundaries involved. See Andersson & Metzger (2009) and Chruściel (2020),
§8.4. Andersson, Mars, & Simon (2008) and Galloway, Miao, & Schoen (2015) are references on MOTSs.

618 See Andersson & Metzger (2009), Theorem 7.3.

619 A minimal surface in a Riemannian manifold (locally) minimizes the volume functional, which is the case iff
its mean extrinsic curvature vanishes. See e.g. Jost (2002), §3.6. Euclidean space $\mathbb{R}^3$ only has non-compact minimal
surfaces; beside the (affine) planes, one has interesting examples like the catenoid and the helicoid (see Wikipedia).

620 See Proposition 9.2.8 in Hawking & Ellis (1973) and Theorem 3.3.18 in Chruściel (2020).

621 One could sharpen this definition to make $\mathrm{mae}(S)$ unique, but this is not necessary for $S = \mathcal{A}^+_\Sigma$: since any of
Definition 10.34 For any surface $S \subset \Sigma$ (in the above sense), a minimal area enclosure $\mathrm{mae}(S)$
is a surface such that $\mathrm{mae}(S) \supset S$, and $\mathrm{Area}(\mathrm{mae}(S)) \le \mathrm{Area}(S')$ for all surfaces $S' \supset S$.


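Thus we replace $\mathcal{A}^+_\Sigma$ by $\mathrm{mae}(\mathcal{A}^+_\Sigma)$, which exists and, being an extremizing surface, saturates the
inequality $\theta^+ \le 0$. Thus $\mathrm{mae}(\mathcal{A}^+_\Sigma)$ is an MOTS. Taking $S = \mathcal{A}^+_\Sigma$ and $S' = H_E^+ \cap \Sigma$ we see that

$\mathrm{Area}(\mathrm{mae}(\mathcal{A}^+_\Sigma)) \le \mathrm{Area}(H_E^+ \cap \Sigma),$ (10.156)


as desired. Hence we may take $A'_\Sigma = \mathrm{Area}(\mathrm{mae}(\mathcal{A}^+_\Sigma))$ in our earlier discussion. Thus we put: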


Definition 10.35 For any asymptotically flat initial data set $(\Sigma, \tilde g, k)$ satisfying (8.110), with
associated asymptotic mass (10.151) and apparent horizon $\mathcal{A}^+_\Sigma$, the Penrose inequality is


$\mathrm{Area}(\mathrm{mae}(\mathcal{A}^+_\Sigma)) \le 16\pi m_\Sigma^2.$ (10.157)


In the static case, i.e. $k = 0$, the following simplifications take place (cf. Definition 8.4):


1. The initial data set $(\Sigma, \tilde g, k)$ becomes an asymptotically flat Riemannian manifold $(\Sigma, \tilde g)$;
2. The assumption (8.110) on the initial data becomes $R \ge 0$, where $R$ is the Ricci scalar;


3. The apparent horizon $\mathcal{A}^+_\Sigma$ becomes the outermost minimal surface $\mathcal{A}_\Sigma$ in $\Sigma$, i.e., the
(unique) minimal surface such that no other minimal surface in $\Sigma$ properly encloses $\mathcal{A}_\Sigma$.⁶²²


4. In computing the area there is no need for the minimal area enclosure.


One can see this for the spatial Schwarzschild metric, provided one uses the radial coordinate $\rho$
instead of $r$, since the metric (10.117) is singular at the place of interest $r = 2m$, see (10.122). By
spherical symmetry it is enough to consider radial perturbations: the area $4\pi r(\rho)^2$ is minimized
iff $\rho = m/2$, which corresponds to $r = 2m$ and hence recovers the apparent = event horizon.⁶²³
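
The minimization can be illustrated numerically. The sketch below assumes the standard relation $r(\rho) = \rho\,(1 + m/2\rho)^2$ between the isotropic coordinate $\rho$ and the areal radius $r$ (presumably what (10.122) expresses; this identification is an assumption, as are the function names), and scans a grid of $\rho$-values for the minimum of $r$, and hence of the area $4\pi r^2$:

```python
import math


def areal_radius(rho: float, m: float) -> float:
    """Areal radius r as a function of the isotropic radius rho (assumed relation)."""
    return rho * (1.0 + m / (2.0 * rho)) ** 2


def minimal_sphere_by_scan(m: float, steps: int = 2000):
    """Scan rho over a uniform grid in (0, 2m] and return (rho, r) minimizing r(rho)."""
    rhos = [m * k / 1000.0 for k in range(1, steps + 1)]
    rho_min = min(rhos, key=lambda rho: areal_radius(rho, m))
    return rho_min, areal_radius(rho_min, m)


m = 1.0
rho_min, r_min = minimal_sphere_by_scan(m)
area_min = 4.0 * math.pi * r_min ** 2
print(f"rho_min = {rho_min}, r_min = {r_min}, area / (16 pi m^2) = {area_min / (16.0 * math.pi * m * m)}")
```

On this grid the minimum sits at $\rho = m/2$, where $r = 2m$ and the area equals $16\pi m^2$, so the minimal sphere saturates the inequality.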


Theorem 10.36 Any complete asymptotically flat 3d Riemannian manifold $(\Sigma, \tilde g)$ with $R \ge 0$, with
asymptotic mass $m_\Sigma = \Pi^0$ defined by (8.103), satisfies the Riemannian Penrose inequality


$\mathrm{Area}(\mathcal{A}_\Sigma) \le 16\pi m_\Sigma^2,$ (10.158)


where $\mathcal{A}_\Sigma$ is the unique outermost minimal surface in $\Sigma$ (assumed to have one end only).⁶²⁴


Furthermore (“rigidity”), equality in (10.158) holds iff the region outside $\mathcal{A}_\Sigma$ is isometric to the
part $r > 2m$ of the Schwarzschild space $(\Sigma_S, \tilde g_S)$ defined in and above (10.117).


We have to refer to the literature for a proof of this.⁶²⁵ Meanwhile, the general case in Definition
10.35 seems out of reach (it has been proved only for spherically symmetric space-times).


its minimal area enclosures is an MOTS, by definition $\mathrm{mae}(\mathcal{A}^+_\Sigma)$ encloses, and hence must coincide with, any rival.
Technically, $\mathrm{mae}(\mathcal{A}^+_\Sigma)$ is not just an MOTS but an outermost MOTS, which is unique if it exists. Even its possible
non-uniqueness would not affect the Penrose inequality (10.157), since any two candidates have the same area.
622 Equivalently, $\mathcal{A}_\Sigma = \partial\left(\bigcup\{U \in \mathcal{O}(\Sigma) \mid \partial U \text{ is a minimal surface}\}\right)$, where “surface” is meant as explained above
(10.153). See Theorem 4.7 in Lee (2019) for existence and uniqueness of $\mathcal{A}_\Sigma$. It can be shown that provided $R \ge 0$,
outermost minimal surfaces are two-spheres (Meeks & Yau, 1980). Note that $\dim(\Sigma) = 3$ throughout this section,
but mutatis mutandis results like this are valid up to $\dim(\Sigma) < 8$ (at $n \ge 8$ smoothness of $\mathcal{A}_\Sigma$ turns out to be lost).
623 Note that the spatial Schwarzschild metric $\tilde g_S$ is complete on the full space $\Sigma_S = \mathbb{R}^3 \setminus \{0\}$ on which it is defined.
624 This assumption can be dropped by taking the outermost minimal surface with respect to some given end. See
Lee (2019), Conjecture 4.12, which also generalizes the inequality to arbitrary dimension.
625 See Huisken & Ilmanen (1997, 2001) and Bray (2001), as well as the reviews cited in footnote 614.
10.12 Epilogue: The laws of black hole thermodynamics


Around 1970, it was noted by various people that the following dictionary made some sense:⁶²⁶

Thermodynamics                 Black holes
equilibrium state              stationary metric $g$
temperature $T$                surface gravity $\kappa$
entropy $S$                    horizon surface area $A$
energy $E$                     asymptotic mass $m$
other conserved quantities     (Komar) asymptotic quantities


The basis for this analogy lies in the following three laws of black hole thermodynamics:⁶²⁷


Zeroth law: The surface gravity is constant on each connected component of the event horizon.


First law: For simplicity taking just one conserved quantity into account, viz. angular momentum $J$,

$\frac{\kappa}{8\pi}\, \delta A = \delta m - \Omega_H\, \delta J,$ (10.159)


where $\Omega_H$ is a constant at the event horizon (playing the role of a chemical potential).


Second law: Hawking’s area law,⁶²⁸ i.e.


$\delta A \ge 0.$ (10.160)
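
As an illustration of the first law, here is a small finite-difference check for the Kerr family in units $G = c = 1$. It is only a sketch under stated assumptions: the area formula is (10.146) rewritten with $J = am$, whereas the expressions $\kappa = \sqrt{m^2 - a^2}/(2mr_+)$ and $\Omega_H = a/(2mr_+)$ are the standard Kerr values, quoted here as assumptions since they do not appear in this section; all function names are hypothetical.

```python
import math


def kerr_area(m: float, J: float) -> float:
    """Horizon area 8*pi*(m^2 + sqrt(m^4 - J^2)), i.e. (10.146) with a = J/m."""
    return 8.0 * math.pi * (m * m + math.sqrt(m ** 4 - J * J))


def kerr_surface_gravity(m: float, J: float) -> float:
    """Standard Kerr surface gravity sqrt(m^2 - a^2) / (2 m r_+) (assumed, not from the text)."""
    a = J / m
    root = math.sqrt(m * m - a * a)
    return root / (2.0 * m * (m + root))


def kerr_horizon_angular_velocity(m: float, J: float) -> float:
    """Standard Kerr horizon angular velocity a / (2 m r_+) (assumed, not from the text)."""
    a = J / m
    r_plus = m + math.sqrt(m * m - a * a)
    return a / (2.0 * m * r_plus)


def first_law_residual(m: float, J: float, dm: float, dJ: float) -> float:
    """Return (kappa / 8 pi) * dA - (dm - Omega_H * dJ) for a small change (dm, dJ)."""
    dA = kerr_area(m + dm, J + dJ) - kerr_area(m, J)
    kappa = kerr_surface_gravity(m, J)
    omega = kerr_horizon_angular_velocity(m, J)
    return kappa / (8.0 * math.pi) * dA - (dm - omega * dJ)


# The residual should vanish to first order in (dm, dJ).
print(first_law_residual(1.0, 0.6, 1e-6, 2e-6))
```

The residual is of second order in the perturbation, which is what one expects if (10.159) is read as a statement about first-order variations within the Kerr family.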


These were initially seen as laws of black hole mechanics. Despite powerful arguments by
Bekenstein, the possibility of a true thermodynamic underpinning was even explicitly denied:⁶²⁹


626 See Thorne (1994) and Weinstein (2021) for some of the history of black hole thermodynamics; pioneering 
papers include Christodoulou (1970), Christodoulou & Ruffini (1971), Penrose & Floyd (1971), Hawking (1972), 
Bekenstein (1972, 1973, 1974), and Bardeen, Carter & Hawking (1973), which stated all four laws. 

627 The zeroth law of classical thermodynamics states that (thermal) equilibrium is an equivalence relation, which
is what allows the introduction of temperature $T$ in the first place, and then implies that $T$ is constant in thermal
equilibrium. The first law (or, if seen as “conservation of energy”, a consequence thereof) is $T\delta S = \delta E + \sum_i \mu_i\, \delta Q_i$,
where the $Q_i$ are the relevant conserved quantities and the $\mu_i$ their (generalized) chemical potentials (for example,
we count volume $V$ as one such $Q_i$, with $\mu_i = p$). The second law is $\delta S \ge 0$, one of the great mysteries of physics.
We omit the third law, which states that $\kappa$ cannot be brought to zero by a ‘finite sequence of operations’ (Bardeen,
Carter & Hawking, 1973) or ‘within a finite advanced time’ (Israel, 1986). This idea is physically ambiguous if not 
disputable and also lacks a clear connection with the usual version of the third law of thermodynamics, to the effect 
that the entropy is zero at zero temperature (which would even be violated by extremal black holes). 

628 Continuing footnote 485 on the history of the definition (10.79) of the “absolute” event horizon, i.e. $H_E^+ =
\partial J^-(\mathscr{I}^+)$, left as something between Hawking and Penrose: in Seife (2021, p. 478 of e-book) Penrose recalls a
telephone conversation he had with Hawking in 1970 in which they discussed the area law including the crucial
role of the definition of the horizon, which Hawking proposed to Penrose but followed this with: ‘it was your idea.’
Penrose adds: ‘I don’t know what he thought. Maybe he thought I had the idea but didn’t quite have it. It’s not clear. 
I don’t know what the story was, really. I never wanted to bring it up. Because it was a big thing for him.’ 

629 We read in Seife (2021), chapter 13, that the word “mechanics” in the title ‘The four laws of black hole
mechanics’ of Bardeen, Carter, & Hawking (1973) was a deliberate provocation against Bekenstein (whose name
they even misspelled as Beckenstein), who first proposed that the analogy between the pertinent laws for black holes 
and the laws of (ordinary) thermodynamics was more than a purely formal one, and hence has physical content. 
Especially Hawking, who had discovered his singularity theorem and the second law, among other things, and 
(perhaps with hindsight) was well on his way to fame and fortune, initially responded quite harshly to Bekenstein, 
who at the time was just a PhD student (though not an entirely powerless one, as his supervisor, Wheeler, who at the 
time had a significant if not controlling influence on the Western GR community, took his side). 
It should however be emphasized that $\kappa/8\pi$ and $A$ are distinct from the temperature and
entropy of the black hole. In fact the effective temperature of a black hole is absolute zero.
(Bardeen, Carter, & Hawking, 1973, p. 168)


However, within a year Hawking made a remarkable U-turn, which changed physics. On the
basis of a calculation in quantum field theory in curved space-time involving pair creation near
the horizon, he predicted the radiation now named after him, which turns a black hole into a
black body and (allegedly) shows that the laws of black hole mechanics are genuinely laws of
black hole thermodynamics. Hawking’s calculation also allowed the explicit identifications


$S = \frac{k_B c^3}{4G\hbar}\, A = A/4;$ (10.161)
$k_B T = \frac{\hbar}{2\pi c}\, \kappa = \kappa/2\pi,$ (10.162)


called the Bekenstein-Hawking entropy and the Hawking temperature, respectively.⁶³⁰ For
example, for a Schwarzschild black hole, where $\kappa = c^4/4mG = 1/4m$, for the latter we obtain

$T = \frac{\hbar c^3}{8\pi G m k_B}.$ (10.163)
[Figure: Tombstone of Stephen Hawking’s grave in Westminster Abbey, containing his ashes. At his 60th birthday
in 2002, Hawking requested equation (10.163) to be engraved on his tombstone.]


Note the spectacular combination of fundamental constants,^{631} hidden by the use of “natural” 
units $G = \hbar = c = k_B = 1$ on the right in (10.161) - (10.162). For a Schwarzschild black hole 
one has $A = 4\pi r_S^2$ with $r_S = 2Gm/c^2 = 2m$, and hence its dimensionless entropy equals 


$$S/k_B = 4\pi G m^2/\hbar c \quad (\approx 10^{77}). \tag{10.164}$$
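
To get a feeling for the orders of magnitude, the following minimal sketch evaluates the Schwarzschild temperature $T = \hbar c^3/(8\pi G m k_B)$, obtained from (10.162) with $\kappa = c^4/4Gm$, and the dimensionless entropy (10.164). The choice of one solar mass and the rounded SI constants are assumptions made purely for illustration; they are not taken from the text.

```python
import math

# Rounded SI constants; the solar mass is an illustrative choice, not from the text.
G = 6.67430e-11          # m^3 kg^-1 s^-2
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
K_B = 1.380649e-23       # J / K
M_SUN = 1.989e30         # kg

def hawking_temperature(m):
    """T = hbar c^3 / (8 pi G m k_B) for a Schwarzschild black hole of mass m."""
    return HBAR * C**3 / (8 * math.pi * G * m * K_B)

def dimensionless_entropy(m):
    """S / k_B = 4 pi G m^2 / (hbar c), cf. (10.164)."""
    return 4 * math.pi * G * m**2 / (HBAR * C)

def entropy_from_area(m):
    """Same quantity via (10.161): S / k_B = c^3 A / (4 G hbar) with A = 4 pi r_S^2."""
    r_s = 2 * G * m / C**2
    return C**3 * 4 * math.pi * r_s**2 / (4 * G * HBAR)

T_SUN = hawking_temperature(M_SUN)
S_SUN = dimensionless_entropy(M_SUN)
print(f"T = {T_SUN:.3e} K, S/k_B = {S_SUN:.3e}")
```

For a solar-mass black hole the temperature comes out far below that of the cosmic microwave background, while the entropy is of order $10^{77}$.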


630 Bekenstein (1973) gave a similar formula for the temperature of a Kerr black hole, whose Schwarzschild limit 
differs from Hawking’s by a multiplicative constant. Hawking’s formula first appeared, in the form (10.162), in 
Hawking (1974). Thus also the temperature T is sometimes named after both Bekenstein and Hawking. 

631 See e.g. https://www.vttoth.com/CMS/physics-notes/311-hawking-radiation-calculator. 

We will not discuss the microscopic justification of black hole thermodynamics, since unlike 
black hole thermodynamics itself, all attempts to underpin it use quantum (field) theory or string 
theory etc. and hence blast the framework of classical GR.^{632} Within the classical setting, it 
cannot be overemphasized how closely this topic is related to black hole uniqueness theorems 
(cf. the previous two sections), since the key to both classical thermodynamics and black hole 
thermodynamics lies in the fact that just a few “emergent” parameters control the situation. 
Black hole thermodynamics comes with all the problems of classical thermodynamics, 
starting with the question what the symbol “δ” in the first and second laws is supposed to mean. 
Amidst the huge literature on thermodynamics and its foundations, mathematical physicists 
typically prefer the axiomatization of Lieb and Yngvason.^{633} This is restricted to equilibrium 
states and transitions between these through processes that fall under the following heading: 


Adiabatic accessibility: A state Y is adiabatically accessible from a state X (...) if it is 
possible to change the state from X to Y by means of an interaction with some device (which 
may consist of mechanical and electrical parts as well as auxiliary thermodynamic systems) 
and a weight, in such a way that the device returns to its initial state at the end of the process. 
(Lieb & Yngvason, 1999, p. 17) 


In contrast, the weight may have risen or fallen (in a gravitational field). This incorporates 
the original thermodynamic idea of a cycle, but avoids the equally traditional but mysterious 
concept of “heat”, which in any case is problematic for black holes. The definition of adiabatic 
accessibility also includes what in black hole thermodynamics is called the “physical process” 
interpretation, in which one studies what happens if things are thrown into a black hole.^{634} 
Thus the “δ” in the laws of black hole thermodynamics should, in principle, refer to changes 
in selected properties of a black hole metric (viz. its asymptotic mass, angular momentum, 
possibly charge, and spatial horizon area) if, due to some intervention, the metric evolves from 
one stationary value to another. Unfortunately, this only seems to apply to the first law (which is 
predicated on the zeroth). A typical application of the second law, mentioned from the beginning 
by Hawking and others, is the merger of two black holes into one, whose area, then, is greater 
than or equal to the sum of the areas of the original constituents.^{635} Since multi-black hole 
metrics are typically unstable (except for the charged Majumdar–Papapetrou metric studied in 
§10.9) it seems impossible to see this merger as an adiabatic evolution of the said type.^{636} 
This, and in fact the whole theory, suggests that each of the laws of black hole thermo- 
dynamics is valid in its own unique setting, and that there is not, as far as we know, a single 
setting (formalized as a set of mathematical assumptions) in which all laws are valid. This nature 
of black hole thermodynamics as patchwork will be reflected by the following discussion, which 
starts out historically and then, in vain, tries to converge to a more systematic presentation. 


632 Surveys of black hole thermodynamics include Wald (1994, 2001), Jacobson (1996), Compère (2006), Bravetti 
(2014), Carlip (2014), Curiel (2014b), Dougherty & Callender (2016), Wall (2018), and Wallace (2018, 2019). 

633 See Lieb & Yngvason (1999), which (as they note) was partly inspired by Planck (1926) and Giles (1964). For 
the connection with (classical) statistical mechanics see also Martin-Löf (1979) and Uffink (2007). 

634 See Wald (1994) and Gao & Wald (2001). This interpretation complements Bardeen, Carter & Hawking (1973), 
who study (asymptotic) parameter changes of unidentified origin for ‘two slightly different stationary axisymmetric 
black hole solutions’, clearly inspired (via the uniqueness theorems) by the Kerr-Newman metric. Their proofs 
suggest that these changes always pass through other such solutions, whereas the “physical process” interpretation 
makes no such assumption, as long as eventually a new equilibrium state = stationary metric is reached. 

635 Hawking (1972) shows that the opposite process, i.e. a bifurcation of one black hole into various black holes, 
cannot happen (even if it were compatible with the area law). See also Hawking & Ellis (1973), Proposition 9.2.5. 

636 Black hole mergers are the source of the gravitational waves detected on earth (Castelvecchi, 2020). 

With hindsight, black hole thermodynamics started with the Penrose process discussed in 
§9.6. In the spirit of “$E = mc^2$” we may opportunistically rewrite the inequality (9.153) as 


$$\delta m - \Omega_+\,\delta J \ge 0, \tag{10.165}$$


where m is the mass and J = am is the angular momentum of the (Kerr) black hole, cf. (9.120). 
Eq. (10.146) now gives Smarr’s formula as well as, with more work, its variational form 


$$\frac{1}{4\pi}A\kappa_+ = m - 2\Omega_+ J; \qquad \frac{1}{4\pi}\,\delta A\,\kappa_+ = 2(\delta m - \Omega_+\,\delta J), \tag{10.166}$$

where the surface gravity $\kappa_+$ at the outer (event) horizon is given by (10.110). To derive the 
second part, one should first express $\kappa_+$ and $\Omega_+$ in the independent variables m and J via (9.120), 
(9.143), (9.117), and (10.110), and then put them back at the end of the calculation. 
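
The two identities in (10.166) can be checked numerically. The sketch below assumes the standard Kerr expressions in units $G = c = 1$, namely $r_+ = m + \sqrt{m^2 - a^2}$, $A = 4\pi(r_+^2 + a^2)$, $\Omega_+ = a/(r_+^2 + a^2)$ and $\kappa_+ = (r_+ - m)/(r_+^2 + a^2)$; these are quoted from the general literature, not from the equations cited above, and the function names are illustrative only.

```python
import math

def kerr_horizon(m, J):
    """Area, angular velocity and surface gravity of the outer Kerr horizon (G = c = 1)."""
    a = J / m                                # rotation parameter, J = a m
    r_plus = m + math.sqrt(m * m - a * a)    # outer horizon radius, requires |a| <= m
    denom = r_plus**2 + a**2                 # equals 2 m r_plus
    return 4 * math.pi * denom, a / denom, (r_plus - m) / denom

def smarr_residual(m, J):
    """kappa A / 4 pi - (m - 2 Omega J); should vanish by Smarr's formula."""
    area, omega, kappa = kerr_horizon(m, J)
    return kappa * area / (4 * math.pi) - (m - 2 * omega * J)

def first_law_residuals(m, J, h=1e-6):
    """Central differences of A(m, J), tested against kappa dA / 8 pi = dm - Omega dJ."""
    _, omega, kappa = kerr_horizon(m, J)
    dA_dm = (kerr_horizon(m + h, J)[0] - kerr_horizon(m - h, J)[0]) / (2 * h)
    dA_dJ = (kerr_horizon(m, J + h)[0] - kerr_horizon(m, J - h)[0]) / (2 * h)
    # Coefficient of dm should be 1, coefficient of dJ should be -Omega.
    return kappa * dA_dm / (8 * math.pi) - 1.0, kappa * dA_dJ / (8 * math.pi) + omega

print(smarr_residual(1.0, 0.6))
print(first_law_residuals(1.0, 0.6))
```

Treating m and J as the independent variables, as suggested in the text, both residuals vanish up to rounding and discretization error.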

Clearly, eq. (10.166) is a special case of the first law of black hole thermodynamics, and if we 
combine it with (10.165) we also obtain an example of the second law (10.160), i.e. the Hawking 
area theorem, at least in the special case that the change of area is caused by the Penrose process. 

A more general argument for (10.160) is obtained by turning Proposition 6.14 on its head.^{637} 
That proposition was originally intended to prove that the assumption $\theta(x) < 0$ leads to cusps or focal points in the 
null hypersurface C defined by (6.61); if we instead assume that C is smooth and that its null 
generators are future complete, then $\theta(x) < 0$ leads to a contradiction, so that under the stated 
assumptions we must have $\theta(x) \ge 0$. We apply this to the case where C is some component 
of the event horizon $H_E^+$ of a black hole region as defined in (10.79), which is not necessarily 
stationary and may even consist of various components, that is, of various black holes (which 
may merge). The structure of these components is described by Proposition 10.16.1, which 
shows that C is ruled by future inextendible lightlike geodesics $\gamma$. In order to apply Proposition 
6.14 (contrapositively) we now need to argue that each component C is smooth (which is not 
automatic, since from Proposition 6.18 we merely know it is locally Lipschitz). This should 
follow from additional assumptions, such as some form of weak cosmic censorship that prevents 
the inextendibility of the lightlike geodesics that rule C from being caused by incompleteness.^{638} 

It is also assumed that one can foliate at least the relevant part of space-time by partial (i.e. 
“wannabe”) Cauchy surfaces $\Sigma_t$, which intersect each component C of $H_E^+$ in a two-sphere $S_t$ (cf. 
Proposition 10.29). In other words, each $S_t$ is a spatial cross-section of some component of the 
event horizon, whose area Area($S_t$) is defined by (6.75). As already remarked, smoothness of 
C then enforces $\theta \ge 0$. Under the assumptions of Proposition 6.14, notably the null curvature 
condition, eq. (6.76) then implies that each area Area($S_t$) and hence also their sum $A_t$, i.e. the 
total area of $H_E^+ \cap \Sigma_t$, can only increase with time (or stay the same). We may write this as 


$$\Sigma_{t'} \subset J^+(\Sigma_t) \;\Rightarrow\; \mathrm{Area}(H_E^+ \cap \Sigma_t) \le \mathrm{Area}(H_E^+ \cap \Sigma_{t'}). \tag{10.167}$$


This is a precise version of the second law, from which the problematic “δA” has been removed. 
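
A classic consequence of the area law is a bound on the energy that a merger can radiate. The sketch below makes simplifying assumptions that go beyond the text: both initial black holes and the final one are taken to be Schwarzschild, the initial holes start far apart and at rest, and units are $G = c = 1$, so that $A = 16\pi m^2$. The area inequality then reads $m_{\rm final}^2 \ge m_1^2 + m_2^2$.

```python
import math

def schwarzschild_area(m):
    """Horizon area A = 4 pi r_S^2 = 16 pi m^2 in units G = c = 1."""
    return 16 * math.pi * m**2

def minimal_final_mass(m1, m2):
    """Smallest final mass allowed by A_final >= A_1 + A_2."""
    return math.sqrt(m1**2 + m2**2)

def max_radiated_fraction(m1, m2):
    """Upper bound on (m1 + m2 - m_final) / (m1 + m2) implied by the area law."""
    return 1.0 - minimal_final_mass(m1, m2) / (m1 + m2)

print(max_radiated_fraction(1.0, 1.0))   # equal masses give the largest bound
print(max_radiated_fraction(1.0, 0.1))
```

Under these assumptions the bound is largest for equal masses, where it equals $1 - 1/\sqrt{2}$, i.e. roughly 29% of the initial mass.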


637 See Hawking (1972) and Hawking & Ellis (1973), §9.2. Various sets of assumptions are known not merely for a 
rigorous proof of the area theorem, but even for its formulation, since the lack of smoothness of the horizon (which 
a priori is only locally Lipschitz, cf. Proposition 6.18) requires conditions making the area (6.75) well defined. See 
Chruściel et al. (2001) for a detailed analysis of various assumptions, including the corresponding proofs. The 
simplest (though by no means the weakest) assumption is global hyperbolicity of the conformal completion $(\hat M, \hat g)$ 
of the given asymptotically flat space-time (M, g), see Definition 10.1, along with future completeness of the null 
generators of the horizon $H_E^+$ and validity of the null curvature condition (as in Theorem 6.15) on $I^-(\mathscr{I}^+)$. 

638 In Hawking’s proof this form was strong asymptotic predictability, which roughly speaking means that 
$I^-(\mathscr{I}^+)$ is contained in a globally hyperbolic region (Hawking & Ellis, 1973, p. 313). See also Wall (2009), §1.2.3. 

Turning to the (later) zeroth law, the key observation was that under various assumptions one 
can sharpen Proposition 10.21 to constancy of the surface gravity $\kappa$ on the entire Killing horizon. 
The relevance of this result follows from Hawking’s rigidity theorem in §10.10, which makes the 
event horizon a Killing horizon. The simplest such result is as follows.^{639} 


Proposition 10.37 The surface gravity $\kappa$ is constant and nonzero on each component of a 
bifurcate Killing horizon $H_X$, and differs at most by a sign on different components thereof. 


Proof. As in the proof of Proposition 10.21, from (10.106) and (10.104) we obtain 


$$\partial_J\kappa^2 = e_J^\mu\,\partial_\mu\kappa^2 = -R^{\alpha}{}_{\beta\mu\rho}\,e_J^\mu X^\rho\cdot\nabla^\beta X_\alpha, \tag{10.168}$$

where J = 1,2 and the spacelike unit vectors $e_J$ form an orthogonal basis of the orthogonal 
complement of X at each $T_xH_X$, $x \in H_X$ (cf. Lemma 4.16). Since X = 0 at the bifurcation surface 
S, it follows from (10.168) that $\kappa^2$ is constant on S. Since different lightlike geodesics ruling $H_X$ 
emanate from different points of S, eq. (10.100) implies that $\kappa^2$ is constant on $H_X$ altogether. 
To show that $\kappa \neq 0$, we note, proving by contradiction, that $\kappa = 0$ and (10.106) imply 
$\nabla_\mu X_\nu = 0$ on S, because the spacelike contractions in (10.106) vanish by themselves and hence 
the total expression is negative semidefinite.^{640} Hence $\nabla_\mu X_\nu$ and $X_\mu$ both vanish on S, but this 
implies that X is identically zero (so that it could not be lightlike, see §5.3).^{641} 


The zeroth law exemplifies the fact that the laws of black hole thermodynamics may be 
derived under various inequivalent assumptions, since the original version was as follows:^{642} 


Proposition 10.38 If the Einstein equations (7.1) and the dominant energy condition (7.65) hold, 
then the surface gravity $\kappa$ is constant on any (necessarily connected) Killing horizon $H_X$. 


Proof. If the null generators of $H_X$ have tangent vectors L, then by Lemma 4.16 we have on $H_X$: 

$$X = f\cdot L, \tag{10.169}$$


where f is some function defined on $H_X$. From (10.99) and $\nabla_L L = 0$, we find $\kappa = Lf$. Assuming 
that $H_X$ is sufficiently smooth, the Frobenius condition (8.94) for (null) hypersurface orthogonality 
of the Killing vector field X and (6.88) then give $k_{\mu\nu} = 0$ on $H_X$, so that also $\theta = 0$ and 
$\sigma_{\mu\nu} = 0$ on $H_X$. The null Raychaudhuri equation (6.98) then gives 


$$R_{\mu\nu}X^\mu X^\nu = 0. \tag{10.170}$$


This also follows by noting that the area Area($S_t$) of a stationary black hole must be independent 
of t, so that (6.76) gives $\theta = 0$ and hence $\dot\theta = 0$, after which (6.98) again yields (10.170). 


639 The assumption of a bifurcate Killing horizon is not very heavy; Rácz & Wald (1996) give arguments ‘supporting 
the view that any spacetime representing the asymptotic final state of a black hole formed by gravitational collapse 
may be assumed to possess a bifurcate Killing horizon or Killing horizon with vanishing surface gravity’ (the latter 
occurs in extremal Kerr and Reissner-Nordström black holes, whose existence astrophysicists deny). 

640 On S we have $\nabla_J X^\mu = e_J^\alpha(\partial_\alpha X^\mu + \Gamma^\mu_{\alpha\beta}X^\beta) = 0 + 0 = 0$, as X vanishes on S and $e_J^\alpha\partial_\alpha = e_J$ is tangent to S. 

641 Any isometry $\psi$ of M is determined at least locally (i.e. in a convex neighbourhood of x) by its tangent map $\psi'_x$ at some 
fixed $x \in M$: to find $\psi(y)$, take the unique geodesic $\gamma$ from x to y, so that $y = \exp_x(Y)$ for some $Y \in T_xM$, and if 
$\psi$ is an isometry, then $\psi(\exp_x(Y)) = \exp_{\psi(x)}(\psi'_x(Y))$. Infinitesimally, this implies that any Killing vector field X is 
determined by $X(x)$ and $\nabla_\mu X_\nu(x)$. See also Chruściel (2020), Proposition 4.3.10, for a direct proof of this. 

642 See Bardeen, Carter & Hawking (1973), §2, as well as Wald (1984), §12.5, and Chruściel (2020), §4.3.4. 

Eq. (10.170) holds on $H_X$, where, using the Einstein equations (7.1), it implies that $T_{\mu\nu}X^\mu X^\nu = 0$. 
Hence the vector T(X), with components $T(X)^\mu := T^\mu{}_\nu X^\nu$, is orthogonal to X, since 


$$g(T(X),X) = T_{\mu\nu}X^\mu X^\nu = 0. \tag{10.171}$$


Therefore, by Lemma 4.16 this vector is either spacelike, or lightlike and hence proportional 
to X, or zero. On the other hand, since X is lightlike and hence causal, the dominant energy 
condition (7.65) forces T(X) to be causal or zero. This excludes the possibility that T(X) is 
spacelike and hence it must be null, all of this on $H_X$ only. Again invoking Lemma 4.16, we 
conclude that T(X) must be proportional to X. Using (7.1) in the opposite direction gives 


$$X^\flat\wedge R(X) = X^\flat\wedge\big(T(X) - \tfrac{1}{2}T\cdot X\big) = 0. \tag{10.172}$$

The final step in the proof is the following equality, which as usual in this proof is valid on $H_X$: 

$$X^\flat\wedge R(X) = d\kappa\wedge X^\flat. \tag{10.173}$$

To prove this, we note that since X is a Killing vector field, eq. (10.99) is equivalent to 

$$X^\mu\nabla_\nu X_\mu = -\kappa X_\nu, \tag{10.174}$$


cf. (3.74). Eq. (10.173) follows by applying the antisymmetrized expression $X_{[\rho}\nabla_{\sigma]}$ to both 
(10.174) and (8.94) and carrying out some lengthy but straightforward rearrangements. Eqs. 
(10.172) and (10.173) yield $d\kappa\wedge X^\flat = 0$ on $H_X$, which forces $d\kappa = 0$ on $H_X$. 


We now return to the first law of black hole thermodynamics (10.159), which “morally” reads 


$$T\,\delta S = \delta E - \Omega_H\,\delta J, \tag{10.175}$$


where we identify m = E and regard the constant $\Omega_H$ as a generalized chemical potential. The 
mass/energy m/E and the angular momentum J of a black hole are defined by the Komar 
formulae (9.118), see below. Referring to the Penrose process discussed above for at least an 
example of the “physical process” interpretation of (10.175), we now give a derivation based 
on the idea that the δ’s indicate that one gently moves the metric from one stationary value to 
another through intermediate stationary metrics. This derivation is based on the Hamiltonian 
formalism of GR.^{643} Like Proposition 10.37, the argument requires a bifurcate Killing horizon, 
but as argued after (10.108) and in footnote 639 this is not really a very strong assumption. 

More seriously, the proof relies on Hawking’s rigidity theorem (i.e. Theorem 10.28 in §10.10), 
to the effect that the event horizon is a Killing horizon for a Killing vector field 


$$X = \partial_t + \Omega_H\,\partial_\varphi, \tag{10.176}$$


which generalizes the one for the Kerr metric, cf. (9.149). This means that this particular 
derivation of the first law also requires the assumptions of the rigidity theorem, which include, 


643 We follow Sudarsky & Wald (1992). The existence of the spacelike surface $\Sigma$ used in the proof was proved by 
Chrusciel & Wald (1994b). The equivalence between the Hamiltonian (ADM) versions of the asymptotic mass and 
angular momentum (which only holds in stationary asymptotically flat space-times) is discussed in Jaramillo & 
Gourgoulhon (2009); see also Poisson (2004), §4.3 and Gourgoulhon (2012), chapter 8. A more elegant derivation 
can be given from the Lagrangian formalism and its associated covariant Noether charges, originally developed by 
Kijowski & Tulczyjew (1979). See Wald (1993), Iyer & Wald (1994), and Jacobson, Kang, & Myers (1994). See 
Gao & Wald (2001) and Poisson (2004), §5.5.3 and §5.5.4 for “physical process” derivations of the first law. 

for example, a version of weak cosmic censorship and imply, among other things, that (M,g) is 
axisymmetric, with Killing vector field $\partial_\varphi$ (the special case $X = \partial_t$, i.e. the Killing vector field 
defining stationarity, is just $\Omega_H = 0$). In general, $\Omega_H$ is some constant (interpreted as the angular 
velocity of the black hole) chosen so that indeed g(X,X) = 0 at the event horizon, see §10.10. 
Recall eqs. (8.214) - (8.222) from the Hamiltonian approach to GR, in which we take $\Sigma$ to be 
a spacelike surface whose boundary at one end is the given bifurcation surface $\mathscr{S}$, and at the 
other end is a two-sphere $S^2_r$ as in (8.126), where we eventually let $r \to \infty$; for simplicity we just 
write this boundary component as $S^2_\infty$. Using manipulations similar to those in the derivation of 
(7.44) but now in one dimension lower, we may rewrite the boundary Hamiltonian (8.220) as 


$$H_B(\Sigma) = \oint_{\partial\Sigma} d^2\sigma^i\,\big(L(\nabla^j g_{ij} - \nabla_i g) + 2S^j\pi_{ij}\big) =: H_B(S^2_\infty) + H_B(\mathscr{S}), \tag{10.177}$$


where L is the lapse and $S = S^i\partial_i$ is the shift, so far arbitrary. The trick is to choose these as 

$$LN + S = X, \tag{10.178}$$

cf. (8.5), where N is the future-pointing normal to $\Sigma$ as usual. This has the effect that 


$$\frac{\partial g_{ij}}{\partial t} = 0; \qquad \frac{\partial\pi^{ij}}{\partial t} = 0, \tag{10.179}$$


since the time evolution generated by (10.178), i.e. the flow of X, consists of isometries. 


Now consider variations of Hg(), see (8.218) and (8.221), induced by one-parameter 
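families (homotopies) $g^s_{ij}$ and $\pi_s^{ij}$ that satisfy the constraints (8.225). Then the variations 


$$\delta g_{ij} := \frac{d}{ds}g^s_{ij}\Big|_{s=0}; \qquad \delta\pi^{ij} := \frac{d}{ds}\pi_s^{ij}\Big|_{s=0} \tag{10.180}$$


satisfy the linearized constraint equations, i.e. $C_0(\delta g, \delta\pi) := \frac{d}{ds}C_0(g^s, \pi_s)\big|_{s=0} = 0$, etc. Then 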
SHg(Z) =0 (10.181) 


by (8.221). On the other hand, 6Hg(X) consists of a bulk term giving the equations of motion 
(8.227) and (8.228) and a boundary term. By (10.179) the former vanishes, and so we must have 


5Hp(X) =0. (10.182) 


The Killing vector field X in Theorem 10.28 is such that at (spatial) infinity we have $L \to 1$ and 
$S \to \Omega_H\,\partial/\partial\varphi$, and hence by definition of these quantities the integral over $S^2_\infty$ in (10.177) gives 


$$\delta H_B(S^2_\infty) = 16\pi(\delta E - \Omega_H\,\delta J), \tag{10.183}$$


cf. (8.126).^{644} To compute the integral over the bifurcation surface $\mathscr{S}$, we perform a partial 
integration and realize that by definition X = 0 and hence L = S = 0 on $\mathscr{S}$. This simply gives 


$$\delta H_B(\mathscr{S}) = -\int_{\mathscr{S}} d^2\sigma_i\,(\partial_j L)\,(g^{ij}g^{kl} - g^{ik}g^{jl})\,\delta g_{kl}, \tag{10.184}$$


644 The factor $16\pi$ comes from the fact that the Einstein–Hilbert action (7.2) should really be $\frac{1}{16\pi G}\int R$. 

where $d^2\sigma_i = d^2\sigma\cdot n_i$, with n the inward normal to $\mathscr{S}$ within $\Sigma$ (see footnote 646), and 

$$d^2\sigma = d^2z\,\sqrt{\det(\tilde g(z))}, \tag{10.185}$$

as in (8.220). Since X = 0 and hence L = 0 on $\mathscr{S}$, cf. (10.178), we have $\partial_j L = n_j\nabla_n L$, so that 

$$\delta H_B(\mathscr{S}) = -\int_{\mathscr{S}} d^2\sigma\,(\nabla_n L)\cdot(g^{kl} - n^k n^l)\,\delta g_{kl} = -2\,\delta\!\int_{\mathscr{S}} d^2\sigma\,\nabla_n L, \tag{10.186}$$


where the second equality follows as in (7.34), with the additional remark that $(\delta^k_i - n^k n_i)\,g_{kl}$ 
is the “covariant” expression for the metric $\tilde g$ on $\mathscr{S}$, as in (7.38) but in one dimension higher 
(throughout this derivation, δ acts only on g and π, not on L and S). We now use the identity 


$$2\int_S d^2\sigma\,(\nabla_n L - k_{ij}n^iS^j) = -\int_S d\sigma_{\mu\nu}\,\nabla^\mu X^\nu, \tag{10.187}$$


which is valid on any 2-surface S, and relates $H_B$ to the Komar definition of a conserved quantity 
defined through any Killing vector field X, cf. (9.118). This time the surface element is 


$$d\sigma_{\mu\nu} = (X_\mu\tilde X_\nu - X_\nu\tilde X_\mu)\,d^2\sigma, \tag{10.188}$$


where X is seen as a lightlike vector on the Killing horizon and hence is complemented by 
another lightlike vector $\tilde X$ orthogonal to $\mathscr{S}$, cf. §6.3, which unlike (6.58) is normalized such that 


$$g(X,\tilde X) = -1. \tag{10.189}$$


Furthermore, $k_{ij}$ is given by (8.24), but in view of (8.210) we can ignore the term $k_{ij}n^iS^j$ since 
$S^j = 0$ on the bifurcation surface $S = \mathscr{S}$. Using (10.188), (10.99), and (10.189) we obtain 


S 


as the second term in (10.188) gives zero because $X^\nu X_\nu = 0$ implies $X^\nu\nabla^\mu X_\nu = 0$, and we have 
taken $\kappa$ out of the integral since it is constant by the zeroth law of black hole thermodynamics, 
i.e. Proposition 10.37.^{646} Since δ in (10.186) only acted on the surface element, we find 


$$\delta H_B(\mathscr{S}) = -2\kappa\,\delta A. \tag{10.191}$$


Eqs. (10.182), (10.183), (10.186), and (10.191) then recover (10.159), which using the identifications 
(10.161) and (10.162) is the first law of black hole thermodynamics (10.175). 

The situation covered by this proof of the first law is quite different from its original Penrose 
process context, in which particles were thrown into a Kerr black hole. Perhaps because it is the 
frontier of fundamental physics, black hole thermodynamics is also a gallimaufry of ideas. 


645 See e.g. Jaramillo & Gourgoulhon (2009), eq. (16). This is a fairly easy exercise. See also footnote 646. 

646 See Poisson (2004), §5.5.3. This shows that $\nabla_n L = \kappa$, as stated without proof in Sudarsky & Wald (1992). The 
sign of $\kappa$ depends on the branch of the bifurcate Killing horizon, and hence on the sign of the normal n to $\mathscr{S}$ within 
$\Sigma$. It can be checked on the example (10.93) in 4d that one needs the inward normal, which gives the minus sign in 
(10.184), since Stokes’s theorem (i.e. partial integration) uses the outward normal. In this example, the bifurcation 
surface is x = t = 0, i.e. the y-z plane, and L = x. The correct normal for the future horizon x = t, where $\kappa = 1$, is 
$n = \partial/\partial x$, which is directed inward, and indeed we duly obtain $\nabla_n L = \partial x/\partial x = 1 = \kappa$. 

A Lie groups, Lie algebras, and constant curvature 


This appendix contains material supporting §4.4 on spaces of constant curvature, but is also 
interesting elsewhere. Its content underpins much of mathematics and mathematical physics.^{647} 


A.1 Lie groups 


We only need real linear Lie groups, which are closed subgroups of $GL_n(\mathbb{R})$, i.e. the group of 
real invertible n × n matrices, with group multiplication simply given by matrix multiplication.^{648} 
For example, SO(3) is the subgroup of $GL_3(\mathbb{R})$ consisting of matrices R that satisfy 


$$R^T R = 1_3; \tag{A.1}$$
$$\det(R) = 1. \tag{A.2}$$


More generally, for some given $T \in GL_n(\mathbb{R})$, the matrices $\gamma \in GL_n(\mathbb{R})$ that for all x, y satisfy 


$$\langle\gamma x, T\gamma y\rangle = \langle x, Ty\rangle, \tag{A.3}$$


or, in other words, leave the bilinear form $\langle x,y\rangle_T = \langle x, Ty\rangle$ invariant (where $\langle\cdot,\cdot\rangle$ is the usual 
inner product on $\mathbb{R}^n$), form a linear Lie group $G_T$. That is, 


$$G_T = \{\gamma \in GL_n(\mathbb{R}) \mid \gamma^T T\gamma = T\}. \tag{A.4}$$


For n = 3 and $T = 1_3$ we obtain $G_T = O(3)$, which has two components: the one containing 
the identity is $SO(3) = O(3)_+$, singled out by det(R) = 1, whereas the other component $O(3)_-$ 
consists of those elements $R \in O(3)$ with det(R) = −1. Note that SO(3) is connected but not 
simply connected. Furthermore, O(3) and SO(3) are compact in the topology inherited from 
$M_3(\mathbb{R}) \cong \mathbb{R}^9$: this follows from the following parametrization of SO(3), with $\alpha, \beta, \gamma \in [0, 2\pi]$: 


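$$R_\gamma = \begin{pmatrix}\cos\gamma & -\sin\gamma & 0\\ \sin\gamma & \cos\gamma & 0\\ 0 & 0 & 1\end{pmatrix},\quad R_\beta = \begin{pmatrix}\cos\beta & 0 & -\sin\beta\\ 0 & 1 & 0\\ \sin\beta & 0 & \cos\beta\end{pmatrix},\quad R_\alpha = \begin{pmatrix}1 & 0 & 0\\ 0 & \cos\alpha & -\sin\alpha\\ 0 & \sin\alpha & \cos\alpha\end{pmatrix}.$$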
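
As a quick sanity check on this parametrization, the sketch below multiplies the three rotation matrices for sample angles and verifies conditions (A.1) and (A.2). The helper names and the ordering $R_\gamma R_\beta R_\alpha$ are illustrative assumptions; the fact that every entry of the product is bounded by 1 in absolute value is the elementary reason behind compactness.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def rot_gamma(g):
    return [[math.cos(g), -math.sin(g), 0.0], [math.sin(g), math.cos(g), 0.0], [0.0, 0.0, 1.0]]

def rot_beta(b):
    return [[math.cos(b), 0.0, -math.sin(b)], [0.0, 1.0, 0.0], [math.sin(b), 0.0, math.cos(b)]]

def rot_alpha(a):
    return [[1.0, 0.0, 0.0], [0.0, math.cos(a), -math.sin(a)], [0.0, math.sin(a), math.cos(a)]]

def so3_element(alpha, beta, gamma):
    """Product R_gamma R_beta R_alpha of the three one-parameter rotations."""
    return matmul(rot_gamma(gamma), matmul(rot_beta(beta), rot_alpha(alpha)))

def is_orthogonal(r, tol=1e-12):
    """Check (A.1): R^T R equals the identity up to rounding."""
    p = matmul(transpose(r), r)
    return all(abs(p[i][j] - (1.0 if i == j else 0.0)) < tol for i in range(3) for j in range(3))

R = so3_element(0.3, 1.1, 2.5)
MINUS_R = [[-x for x in row] for row in R]   # lies in the other component O(3)_-
print(is_orthogonal(R), det3(R), det3(MINUS_R))
```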


Staying in n = 3 for the moment, instead of $T = 1_3$ we may take T = diag(−1,1,1). Then 
$G_T = O(1,2)$ is called the Lorentz group (in space-time dimension 3). It has four components, 
singled out by the four combinations of the two independent conditions 


$$\det(\Lambda) = \pm 1; \qquad \pm\Lambda_{00} > 0; \tag{A.5}$$


for an indefinite matrix T like this it is customary to label the entries $\Lambda_{ij}$ by i, j = 0,1,2 instead 
of 1,2,3. In particular, the identity component $O(1,2)_0$ satisfies $\det(\Lambda) = 1$ and $\Lambda_{00} > 0$.^{649} 
Consequently, even the subgroup $SO(1,2) = \{\Lambda \in O(1,2) \mid \det(\Lambda) = 1\}$ has two components. 


647 References for this appendix are Helgason (1978), O’Neill (1983), Vinberg (1993), and Wolf (2011). 

648 Such Lie groups are not necessarily closed in $M_n(\mathbb{R})$, since invertibility of matrices is an open condition (we 
call a condition open if its solution set is open, and closed if its solution set is closed). For example, the sequence 
$g_n = 1_n/n$ in $GL_n(\mathbb{R})$ converges to zero, so the limit is not in $GL_n(\mathbb{R})$. The topology used may either be the usual 
one on $\mathbb{R}^{n^2}$ or the matrix norm topology; these are equivalent. 

649 This follows from the fact that any matrix $\Lambda \in O(1,2)$ satisfies $\Lambda_{00}^2 - \Lambda_{10}^2 - \Lambda_{20}^2 = 1$, so that $|\Lambda_{00}| \ge 1$, and from 
the fact that $\mathrm{sgn}(\Lambda_{00})$ and $\det(\Lambda)$ are continuous functions on O(1,2). 

Another important difference with SO(3) is that SO(1,2) is non-compact. This follows, for 
example, from the following parametrization of $O(1,2)_0$, where $\alpha \in [0,2\pi]$ and $\beta, \gamma \in \mathbb{R}$: 


$$B_\gamma = \begin{pmatrix}\cosh\gamma & \sinh\gamma & 0\\ \sinh\gamma & \cosh\gamma & 0\\ 0 & 0 & 1\end{pmatrix},\quad B_\beta = \begin{pmatrix}\cosh\beta & 0 & \sinh\beta\\ 0 & 1 & 0\\ \sinh\beta & 0 & \cosh\beta\end{pmatrix},\quad R_\alpha = \begin{pmatrix}1 & 0 & 0\\ 0 & \cos\alpha & -\sin\alpha\\ 0 & \sin\alpha & \cos\alpha\end{pmatrix}.$$


From these, one obtains the matrices $\Lambda$ with $\det(\Lambda) = 1$ and $\Lambda_{00} < 0$ by multiplication with 
diag(−1,−1,1), those with $\det(\Lambda) = -1$ and $\Lambda_{00} > 0$ by multiplication with diag(1,−1,1), and 
finally, those with $\det(\Lambda) = -1$ and $\Lambda_{00} < 0$ by multiplication with diag(−1,1,1). 
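
The following sketch illustrates both statements numerically: a product of two boosts and a rotation preserves the form diag(−1,1,1), and left multiplication by the three diagonal matrices just mentioned moves it into the other three components. The helper names, the sample parameters and the choice of left multiplication are assumptions made for illustration. Non-compactness shows up in the unbounded entry $\cosh\gamma$.

```python
import math

ETA = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def boost_gamma(g):
    return [[math.cosh(g), math.sinh(g), 0.0], [math.sinh(g), math.cosh(g), 0.0], [0.0, 0.0, 1.0]]

def boost_beta(b):
    return [[math.cosh(b), 0.0, math.sinh(b)], [0.0, 1.0, 0.0], [math.sinh(b), 0.0, math.cosh(b)]]

def rot_alpha(a):
    return [[1.0, 0.0, 0.0], [0.0, math.cos(a), -math.sin(a)], [0.0, math.sin(a), math.cos(a)]]

def diag(x, y, z):
    return [[x, 0.0, 0.0], [0.0, y, 0.0], [0.0, 0.0, z]]

def preserves_eta(lam, tol=1e-9):
    """Check Lambda^T eta Lambda = eta up to rounding."""
    p = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(p[i][j] - ETA[i][j]) < tol for i in range(3) for j in range(3))

def component(lam):
    """Label of the component: (sign of det, sign of Lambda_00)."""
    return (1 if det3(lam) > 0 else -1, 1 if lam[0][0] > 0 else -1)

LAM = matmul(boost_gamma(2.0), matmul(boost_beta(-1.5), rot_alpha(0.7)))
COMPONENTS = [component(LAM),
              component(matmul(diag(-1.0, -1.0, 1.0), LAM)),
              component(matmul(diag(1.0, -1.0, 1.0), LAM)),
              component(matmul(diag(-1.0, 1.0, 1.0), LAM))]
print(preserves_eta(LAM), COMPONENTS, boost_gamma(10.0)[0][0])
```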

More generally, $O(k,l) \subset GL(k+l, \mathbb{R})$ is the linear Lie group defined by n = k + l and 


$$T = g := \mathrm{diag}(\underbrace{-1,\ldots,-1}_{k},\underbrace{1,\ldots,1}_{l}); \tag{A.6}$$


hence elements of O(k,l) are matrices $\gamma \in GL(k+l, \mathbb{R})$ that satisfy $\gamma^T g\gamma = g$. Of course, the 
Lorentz group O(1,3) is crucial for special and general relativity, but apart from k = 1 we will 
also be interested in k = 0 and k = 2. We write O(l) for O(0,l), which is compact, but none 
of the groups O(k,l) with k > 0 is compact, except when l = 0, in which case O(k,0) = O(k). 
Each group O(l) has two components (distinguished as for l = 3 by the sign of their determinant, 
or, equivalently, by being orientation-preserving or reversing), whereas each O(k,l) with k > 0 
and l > 0 has four, distinguished by their containing I, −I, T (time reversal), and −T (parity). 

The additive (and hence abelian) groups $\mathbb{R}^n$ are also real linear Lie groups (although this is 
not their simplest description!), since one may identify $a \in \mathbb{R}^n$ with the 2n × 2n matrix 


$$a \equiv \begin{pmatrix} 1_n & \mathrm{diag}(a)\\ 0 & 1_n\end{pmatrix}, \tag{A.7}$$


where diag(a) is the diagonal n × n matrix with entries $(a_1,\ldots,a_n)$ on the diagonal. Indeed, 
matrix multiplication reproduces addition in diag(a). The last Lie groups of interest to us are 


$$E(n) = O(n)\ltimes\mathbb{R}^n; \tag{A.8}$$
$$P(n) = O(1,n-1)\ltimes\mathbb{R}^n, \tag{A.9}$$


called the Euclidean group and the Poincaré group in dimension n. They are the isometry 
groups of the Euclidean metric $\delta = \mathrm{diag}(1,\ldots,1)$ and Minkowski metric $\eta = \mathrm{diag}(-1,1,\ldots,1)$ 
on $\mathbb{R}^n$, respectively. These are examples of semidirect products, which are defined more 
generally as follows. Let some group L act linearly on a vector space V. Then the operation 


$$(\lambda, v)\cdot(\lambda', v') = (\lambda\lambda', v + \lambda\cdot v'); \tag{A.10}$$
$$(\lambda, v)^{-1} = (\lambda^{-1}, -\lambda^{-1}\cdot v) \tag{A.11}$$


turns L × V into a group, called the semidirect product $L\ltimes V$ of L and V. If $L \subset GL_n(\mathbb{R})$ is a linear 
Lie group and $V = \mathbb{R}^n$, then $L\ltimes V$ is a linear Lie group in $GL_{2n}(\mathbb{R})$, realized by the matrices 


$$\begin{pmatrix}\lambda & v\\ 0 & 1_n\end{pmatrix}, \tag{A.12}$$


where $\lambda \in L$ and, in the upper right block, $v \in M_n(\mathbb{R})$ denotes the matrix with $v \in V$ in every column. 
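
The block-matrix realization (A.12) can be tested against the semidirect product law, here assumed in the form $(\lambda, v)\cdot(\lambda', v') = (\lambda\lambda', v + \lambda\cdot v')$. The sketch below does so for n = 2, with L = SO(2) acting on the plane; the function names `product` and `embed` are hypothetical and serve only this illustration.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]

def act(lam, v):
    """Linear action of a 2x2 matrix on a vector in the plane."""
    return [lam[0][0] * v[0] + lam[0][1] * v[1], lam[1][0] * v[0] + lam[1][1] * v[1]]

def product(g1, g2):
    """Semidirect product law (lam1, v1)(lam2, v2) = (lam1 lam2, v1 + lam1 v2)."""
    (lam1, v1), (lam2, v2) = g1, g2
    w = act(lam1, v2)
    return matmul(lam1, lam2), [v1[0] + w[0], v1[1] + w[1]]

def embed(lam, v):
    """4x4 block matrix [[lam, V], [0, 1]] where V has the vector v in every column."""
    return [[lam[0][0], lam[0][1], v[0], v[0]],
            [lam[1][0], lam[1][1], v[1], v[1]],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def max_difference(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

G1 = (rotation(0.4), [1.0, -2.0])
G2 = (rotation(-1.3), [0.5, 3.0])
LHS = matmul(embed(*G1), embed(*G2))   # multiply the matrices
RHS = embed(*product(G1, G2))          # embed the abstract product
print(max_difference(LHS, RHS))
```

Matrix multiplication and the abstract product agree, so the map `embed` is a homomorphism, at least in this example.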


A.2 Lie algebras 


Abstractly, a Lie algebra over $\mathbb{R}$ is defined as a real vector space A equipped with a bilinear 
map $[\cdot,\cdot] : A \times A \to A$ that satisfies antisymmetry and the Jacobi identity, i.e., 

$$[a,b] + [b,a] = 0; \tag{A.13}$$
$$[a,[b,c]] + [c,[a,b]] + [b,[c,a]] = 0 \quad (a,b,c \in A). \tag{A.14}$$


Concretely, any linear subspace $\mathfrak{g} \subset M_n(\mathbb{R})$ that is closed under the commutator 

$$[A,B] := AB - BA, \tag{A.15}$$


which automatically satisfies the Jacobi identity, is a Lie algebra (and similarly over the complex 
numbers). A special case is the Lie algebra of a linear Lie group $G \subset GL_n(\mathbb{R})$, subtly defined by 


$$\mathfrak{g} = \{A \in M_n(\mathbb{R}) \mid e^{tA} \in G\ \ \forall t \in \mathbb{R}\}, \tag{A.16}$$


where the exponential map $\exp : \mathfrak{g} \to G$ is just given by its usual (norm-convergent) power series. 

It is a nontrivial fact that this concrete Lie algebra is also an abstract one, notably that $\mathfrak{g}$ is a 
vector space and that the bracket (A.15) indeed maps $\mathfrak{g}\times\mathfrak{g}$ to $\mathfrak{g}$. The former property follows 
from the Lie product formula 


$$e^{A+B} = \lim_{m\to\infty}\left(e^{A/m}e^{B/m}\right)^m, \tag{A.17}$$


combined with the axiom that G be closed in $GL_n(\mathbb{R})$. The latter property derives from 

$$[A,B] = \frac{d}{dt}\left(e^{tA}Be^{-tA}\right)\Big|_{t=0}, \tag{A.18}$$


combined with a lemma about matrices showing that if $g \in G$ and $A \in \mathfrak{g}$, then $gAg^{-1} \in \mathfrak{g}$, which 
in turn follows from the definition of the exponential, implying $\exp(gAg^{-1}) = g\exp(A)g^{-1}$. 
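
Both (A.17) and (A.18) are easy to probe numerically. The sketch below is only an illustration under stated assumptions: the matrix exponential is computed from a truncated power series (adequate for small matrices of modest norm), the test matrices A and B are two non-commuting antisymmetric 3 × 3 matrices chosen by hand, and the derivative in (A.18) is approximated by a central difference.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def expm(a, terms=40):
    """Matrix exponential via its truncated power series."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(1.0 / k, matmul(term, a))   # term now equals a^k / k!
        result = add(result, term)
    return result

def mat_pow(a, m):
    """a^m by repeated squaring."""
    result = identity(len(a))
    while m > 0:
        if m & 1:
            result = matmul(result, a)
        a = matmul(a, a)
        m >>= 1
    return result

def dist(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
B = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
COMMUTATOR = add(matmul(A, B), scale(-1.0, matmul(B, A)))

def product_formula_error(m):
    """Distance between (e^{A/m} e^{B/m})^m and e^{A+B}, cf. (A.17)."""
    step = matmul(expm(scale(1.0 / m, A)), expm(scale(1.0 / m, B)))
    return dist(mat_pow(step, m), expm(add(A, B)))

def commutator_from_derivative(h=1e-5):
    """Central-difference approximation of d/dt (e^{tA} B e^{-tA}) at t = 0, cf. (A.18)."""
    plus = matmul(matmul(expm(scale(h, A)), B), expm(scale(-h, A)))
    minus = matmul(matmul(expm(scale(-h, A)), B), expm(scale(h, A)))
    return scale(1.0 / (2 * h), add(plus, scale(-1.0, minus)))

print(product_formula_error(10), product_formula_error(1000))
print(dist(commutator_from_derivative(), COMMUTATOR))
```

The error in the product formula shrinks as m grows, and the finite-difference derivative reproduces AB − BA.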


If $G = G_T$ is defined by (A.4), then its Lie algebra is 


$$\mathfrak{g}_T = \{A \in M_n(\mathbb{R}) \mid A^T T = -TA\}. \tag{A.19}$$

For example, taking T = diag(1,1,1), the Lie algebra so(3) of SO(3) consists of all real 3 × 3 
matrices X that satisfy $X^T = -X$. As a vector space $\mathfrak{so}(3) \cong \mathbb{R}^3$, since so(3) has a basis 

$$J_1 = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0\end{pmatrix},\quad J_2 = \begin{pmatrix}0 & 0 & 1\\ 0 & 0 & 0\\ -1 & 0 & 0\end{pmatrix},\quad J_3 = \begin{pmatrix}0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix}, \tag{A.20}$$


whose linear span gives all 3 × 3 real antisymmetric matrices. A vector space isomorphism 
$\mathbb{R}^3 \to \mathfrak{so}(3)$ is then given by $(x,y,z) \mapsto xJ_1 + yJ_2 + zJ_3$. The commutators of these elements are 

$$[J_1,J_2] = J_3; \qquad [J_2,J_3] = J_1; \qquad [J_3,J_1] = J_2, \tag{A.21}$$


and by linearity these determine the Lie bracket of arbitrary elements of so(3). 
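
The cyclic commutation relations of so(3) can be confirmed by direct matrix multiplication. The sketch below assumes the basis $J_1, J_2, J_3$ of (A.20) in the form of the standard generators of rotations about the three coordinate axes; since all entries are integers the comparison is exact.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(a, b):
    """[A, B] = AB - BA, cf. (A.15)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def is_antisymmetric(a):
    return all(a[i][j] == -a[j][i] for i in range(3) for j in range(3))

# Assumed basis of so(3): generators of rotations about the x, y and z axes.
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

print(commutator(J1, J2) == J3, commutator(J2, J3) == J1, commutator(J3, J1) == J2)
```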
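Similarly, according to (A.19) the Lie algebra of SO(1,2) consists of all $A \in M_3(\mathbb{R})$ that satisfy 


$$A^T\,\mathrm{diag}(-1,1,1) = -\mathrm{diag}(-1,1,1)\,A. \tag{A.22}$$

There are good reasons for taking the basis 


$$e_1 = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0\end{pmatrix},\quad e_2 = \begin{pmatrix}0 & 0 & -1\\ 0 & 0 & 0\\ -1 & 0 & 0\end{pmatrix},\quad e_3 = \begin{pmatrix}0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix}, \tag{A.23}$$

with commutation relations 

$$[e_1,e_2] = e_3; \qquad [e_3,e_1] = e_2; \qquad [e_3,e_2] = e_1. \tag{A.24}$$

For later use, another basis 

$$e_1' = e_3; \qquad e_2' = -e_2; \qquad e_3' = e_1 \tag{A.25}$$

is also useful, with Lie brackets 

$$[e_1',e_2'] = -e_3'; \qquad [e_3',e_1'] = e_2'; \qquad [e_3',e_2'] = -e_1'. \tag{A.26}$$


Although SO(2,1) is isomorphic to SO(1,2), its Lie algebra has a different basis, e.g. 


$$f_1 = \begin{pmatrix}0 & 0 & -1\\ 0 & 0 & 0\\ -1 & 0 & 0\end{pmatrix},\quad f_2 = \begin{pmatrix}0 & 1 & 0\\ -1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix},\quad f_3 = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0\end{pmatrix}, \tag{A.27}$$


with commutation relations 


$$[f_1,f_2] = -f_3; \qquad [f_3,f_1] = f_2; \qquad [f_3,f_2] = f_1. \tag{A.28}$$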
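
The relations (A.24), (A.26) and (A.28) can be verified in the same mechanical way. The sketch below takes the matrices $e_i$ and $f_i$ as read off from (A.23) and (A.27), and additionally checks the defining condition $A^T T = -TA$ of (A.19). The metric diag(−1,−1,1) used for the $f_i$ is an assumption based on the convention that O(k,l) has k minus signs.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def neg(a):
    return [[-x for x in row] for row in a]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def in_lie_algebra(a, t):
    """Defining condition A^T T = -T A of (A.19)."""
    return matmul(transpose(a), t) == neg(matmul(t, a))

ETA_12 = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]    # signature for so(1,2)
ETA_21 = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # assumed signature for so(2,1)

E1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
E2 = [[0, 0, -1], [0, 0, 0], [-1, 0, 0]]
E3 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
P1, P2, P3 = E3, neg(E2), E1                   # primed basis of (A.25)

F1 = [[0, 0, -1], [0, 0, 0], [-1, 0, 0]]
F2 = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]
F3 = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]

print(commutator(E1, E2) == E3, commutator(F1, F2) == neg(F3))
```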


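The last interesting three-dimensional cases are the Lie algebras of the groups (A.8) and (A.9) in 
n = 2. To find the Lie brackets in a suitable basis, we note that in general the Lie algebra $\mathfrak{g}$ of a 
semidirect product $L\ltimes\mathbb{R}^n$ is $\mathfrak{l}\oplus\mathbb{R}^n$ as a vector space, with commutators given by 


$$[(A,v),(B,w)] = ([A,B], Aw - Bv), \tag{A.29}$$

where $A, B \in \mathfrak{l}$ and $v, w \in V$. Since SO(2) consists of all matrices 


$$\begin{pmatrix}\cos\alpha & -\sin\alpha\\ \sin\alpha & \cos\alpha\end{pmatrix}, \qquad \alpha \in [0,2\pi], \tag{A.30}$$


we may take the basis 


$$j_1 = \begin{pmatrix}1\\ 0\end{pmatrix},\quad j_2 = \begin{pmatrix}0\\ 1\end{pmatrix},\quad j_3 = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}, \tag{A.31}$$


the former two forming a basis of $\mathbb{R}^2$, and find the commutation relations from (A.29) to be 

$$[j_1,j_2] = 0; \qquad [j_3,j_1] = j_2; \qquad [j_3,j_2] = -j_1. \tag{A.32}$$


For the Poincaré group in 2d, i.e. P(2), on the other hand, we take 


$$k_1 = \begin{pmatrix}1\\ 0\end{pmatrix},\quad k_2 = \begin{pmatrix}0\\ 1\end{pmatrix},\quad k_3 = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}, \tag{A.33}$$


to obtain the commutation relations 


$$[k_1,k_2] = 0; \qquad [k_3,k_1] = k_2; \qquad [k_3,k_2] = k_1. \tag{A.34}$$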
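
The bracket (A.29) is simple enough to implement literally. In the sketch below an element of the Lie algebra is stored as a pair (A, v) of a 2 × 2 matrix and a vector in the plane. The basis elements are assumptions chosen to be consistent with (A.32) and (A.34): $j_3$ generates rotations, $k_3$ generates boosts, and the translations are the two standard basis vectors.

```python
ZERO2 = [[0, 0], [0, 0]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def act(a, v):
    return [a[0][0] * v[0] + a[0][1] * v[1], a[1][0] * v[0] + a[1][1] * v[1]]

def bracket(x, y):
    """[(A, v), (B, w)] = ([A, B], A w - B v), cf. (A.29)."""
    (a, v), (b, w) = x, y
    ab, ba = matmul2(a, b), matmul2(b, a)
    comm = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    aw, bv = act(a, w), act(b, v)
    return comm, [aw[0] - bv[0], aw[1] - bv[1]]

def neg(x):
    a, v = x
    return [[-e for e in row] for row in a], [-e for e in v]

ZERO = (ZERO2, [0, 0])

# Euclidean algebra e(2): two translations and the rotation generator.
j1, j2, j3 = (ZERO2, [1, 0]), (ZERO2, [0, 1]), ([[0, -1], [1, 0]], [0, 0])

# Poincare algebra p(2): two translations and the boost generator.
k1, k2, k3 = (ZERO2, [1, 0]), (ZERO2, [0, 1]), ([[0, 1], [1, 0]], [0, 0])

print(bracket(j3, j1) == j2, bracket(j3, j2) == neg(j1))
print(bracket(k3, k1) == k2, bracket(k3, k2) == k1)
```

The only difference between the two algebras is the sign in the last bracket, which reflects the difference between a rotation and a boost.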
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A.3 Homogeneous manifolds 


Spaces of constant curvature are special cases of homogeneous manifolds (and more specifically 
of symmetric spaces). To start, we quote the following basic technical result without proof:^{650} 


Proposition A.1 Let G be a Lie group and $H \subset G$ a closed subgroup. Then H is itself a Lie 
group and there exists a smooth structure on the homogeneous space G/H such that: 


1. dim(G/H) = dim(G) − dim(H); 
2. The canonical projection $G \to G/H$, $\gamma \mapsto \gamma H$, is smooth; 
3. The canonical G-action $G\times(G/H) \to G/H$, $(\gamma_1, \gamma_2H) \mapsto (\gamma_1\gamma_2)H$, is smooth. 


We write such group actions as $\gamma_1(\gamma_2H) = (\gamma_1\gamma_2)H$. It is clear that G acts transitively on G/H 
(for any $x \in G/H$ and $y \in G/H$ there is $\gamma \in G$ such that $y = \gamma x$). Without loss of generality,^{651} 
we may also assume that G acts effectively on G/H: if $\gamma x = x$ for all $x \in G/H$, then $\gamma = e$. 

Homogeneous spaces arise naturally if a Lie group G acts smoothly and transitively on 
a manifold M (in which case M is called a homogeneous G-space). Then $M \cong G/H$ with 
$H = G_{x'}$ (i.e. the stability group of some fixed $x' \in M$), under the diffeomorphism $M \to G/H$, 
$x \mapsto \gamma H$, where $\gamma \in G$ satisfies $\gamma x' = x$; the inverse map $G/H \to M$ is $\gamma H \mapsto \gamma x'$ (both maps are 
independent of the choice of $\gamma \in \gamma H$), and this identification $M \cong G/H$ is G-equivariant. 

The following isomorphism will be very useful in all that follows: 


$$T_H(G/H) \cong \mathfrak{g}/\mathfrak{h}, \tag{A.35}$$

where $\mathfrak{g}$ and $\mathfrak{h}$ are the Lie algebras of the Lie groups G and H, respectively. To see this, let us 
consider a more general situation, where a Lie group G acts smoothly on a manifold M, that is, 
$\varphi : G \times M \to M$ is a smooth G-action on M. We will write $\varphi_\gamma(x)$ (or simply $\gamma \cdot x$) for $\varphi(\gamma, x)$, so 
that each map $\varphi_\gamma : M \to M$ is a diffeomorphism. For each $A \in \mathfrak{g}$ we define a map 


$$A_M : C^\infty(M) \to C^\infty(M); \qquad A_M f(x) = \frac{d}{dt} f(e^{tA} \cdot x)\Big|_{t=0}. \tag{A.36}$$

This defines a derivation on $C^\infty(M)$ and hence a vector field on M, so that $A_M \in \mathfrak{X}(M)$, and we 
have a map $A \mapsto A_M$ from $\mathfrak{g}$ to $\mathfrak{X}(M)$. It can be shown that our map has good properties: 


Proposition A.2 The map $A \mapsto A_M$ is linear and for all $A, B \in \mathfrak{g}$ satisfies 

$$[A_M, B_M] = -[A, B]_M. \tag{A.37}$$

In other words, our map is an anti-homomorphism of Lie algebras (with respect to the usual 
commutator bracket of vector fields). Clearly, at any $x \in M$ we obtain a map $\mathfrak{g} \to T_xM$ by 
regarding $A_M(x)$ as an element of $T_xM$. In the case M = G/H at hand, let us now take x = H. 


Lemma A.3 The linear map $\mathfrak{g} \to T_H(G/H)$ defined by (A.36) has kernel $\mathfrak{h}$. 


650 See e.g. Kobayashi & Nomizu (1963), Proposition 1.4.2, or Helgason (1978), §II.4. 

651 If G does not act effectively on G/H, take the largest normal subgroup $H_0 \subset H$ that is also normal in G, and 
define $G^* = G/H_0$ and $H^* = H/H_0$. Then $G/H \cong G^*/H^*$ and $G^*$ acts effectively on $G^*/H^*$. An example where 
this is necessary occurs if $H \subset Z(G)$, in which case all of H acts trivially on G/H. Fortunately, the isometry group 
of a (semi) Riemannian manifold always acts effectively on M. 

652 See e.g. Marsden & Ratiu (1994), Proposition 9.3.6. Note that Ay = —@, (A), cf. (8.246). 
Proof. If $A \in \mathfrak{h}$, then $\exp(tA) \in H$ by definition of $\mathfrak{h}$ (see §A.2). But hH = H for any $h \in H$, 
whence $A_{G/H}(H) = 0$. Hence $\mathfrak{h}$ lies in the kernel of the map $\mathfrak{g} \to T_H(G/H)$. Conversely, 
$\gamma H = H$ iff $\gamma \in H$, and $h \in H$ lies near the identity iff $h = \exp(tA)$ for some $A \in \mathfrak{h}$. 

Lemma A.3 implies (A.35) by a dimension count based on Proposition A.1.1, which gives 


dim(G/H) = dim(G) — dim(H) = dim(g) — dim(h) = dim(g/h). (A.38) 
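As a quick illustration of the dimension count (A.38), consider the standard example G = O(n+1) and H = O(n), which reappears for n = 2 later in this appendix. The sketch below assumes the familiar formula dim O(m) = m(m-1)/2, which is not derived in this section.

```latex
\dim\bigl(O(n+1)/O(n)\bigr)
  = \dim O(n+1) - \dim O(n)
  = \frac{(n+1)n}{2} - \frac{n(n-1)}{2}
  = n
  = \dim S^n .
```

For n = 2 this gives 3 - 1 = 2, in agreement with the fact that the quotient is a surface.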


The isomorphism (A.35) gets more body if we combine it with the residual H-action on 
$T_H(G/H)$. For any diffeomorphism $\varphi$ of M, the derivative $\varphi'_x$ maps $T_xM$ linearly to $T_{\varphi(x)}M$. If 
$\varphi(x) = x$, then $\varphi'_x \in \mathrm{Hom}(T_xM)$. If the diffeomorphisms $\varphi_\gamma$ come from a G-action on M, then 

$$G_x = \{\gamma \in G \mid \gamma \cdot x = x\} \tag{A.39}$$

is the stabilizer of x. If $\gamma \in G_x$, the linear maps $\varphi'_\gamma : T_xM \to T_xM$ combine into a homomorphism 

$$\tau_x : G_x \to GL(T_xM); \qquad \gamma \mapsto \varphi'_\gamma, \tag{A.40}$$

called the isotropy representation of $G_x$ in $T_xM$ (here $GL(T_xM)$ consists of all invertible linear 
maps from $T_xM$ to $T_xM$). This applies in particular to M = G/H and x = H, so that we obtain 

$$\tau_H : H \to GL(T_H(G/H)); \qquad k \mapsto \varphi'_k. \tag{A.41}$$

We will now explicitly find $\tau_H$ under the isomorphism (A.35). We know that any group G acts 
on itself by the adjoint action $\mathrm{Ad}_\gamma(\delta) = \gamma\delta\gamma^{-1}$. If G is a Lie group,⁶⁵³ this action defines a 
representation Ad′ of G on its Lie algebra $\mathfrak{g}$, defined by $\mathrm{Ad}'_\gamma(X) = \gamma X \gamma^{-1}$. This action may, of 
course, be restricted to $H \subset G$, and it is easy to see that this restriction quotients to $\mathfrak{g}/\mathfrak{h}$. In our 
application to spaces with constant curvature, $\mathfrak{g}$ will have a reductive decomposition 

$$\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}, \tag{A.42}$$

where (trivially) not only $\mathfrak{h}$, but also $\mathfrak{p}$ is invariant under $\mathrm{Ad}'_k$ for any $k \in H$ (if H is connected, 
this is equivalent to $[\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p}$). In that case, we may replace the isomorphism (A.35) by 

$$T_H(G/H) \cong \mathfrak{p}. \tag{A.43}$$


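Proposition A.4 1. Under the isomorphism (A.35), the isotropy representation (A.41) of H 
on $T_H(G/H)$ maps to the adjoint action of H on $\mathfrak{g}/\mathfrak{h}$ (still denoted by Ad′): 

$$\tau_H(k)[A] = [\mathrm{Ad}'_k(A)], \tag{A.44}$$

where $A \in \mathfrak{g}$ and $[A] \in \mathfrak{g}/\mathfrak{h}$, seen as an element of $T_H(G/H)$ via the isomorphism (A.35). 

2. Consequently, under the isomorphism (A.43), assuming that $\mathfrak{p}$ is Ad′(H)-invariant, the 
same isotropy representation of H is mapped to the adjoint action of H on $\mathfrak{p}$. 


Indeed, for any $A \in \mathfrak{g}$, $k \in H$, and $f \in C^\infty(G/H)$ we have, cf. (A.36) and (A.41), 

$$\begin{aligned}
(\tau_H(k)A_{G/H})f(H) &= \frac{d}{dt} f(k e^{tA} \cdot H)\Big|_{t=0} = \frac{d}{dt} f(k e^{tA} k^{-1} \cdot H)\Big|_{t=0} \\
&= \frac{d}{dt} f(e^{t\,kAk^{-1}} \cdot H)\Big|_{t=0} = (\mathrm{Ad}'_k A)_{G/H} f(H).
\end{aligned}$$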


653 It follows from our definition of a Lie algebra in Appendix A.2 that Ad′ is well defined as well as linear. 

The following examples of homogeneous spaces arose in §4.4, restricted to dimension two: 

$$S^2 \cong O(3)/O(2); \qquad dS^2 \cong O(1,2)/O(1,1); \tag{A.45}$$
$$\mathbb{R}^2 \cong E(2)/O(2); \qquad \mathbb{R}^2 \cong P(2)/O(1,1); \tag{A.46}$$
$$H^2 \cong O(1,2)/O(2); \qquad AdS^2 \cong O(2,1)/O(1,1), \tag{A.47}$$

where we put p = 1 (and also $S^2 \equiv S^2_1$, etc.), so that the non-flat spaces in question are given by 

$$S^2 = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 + x_3^2 = 1\}; \tag{A.48}$$
$$dS^2 = \{(x_0, x_1, x_2) \in \mathbb{R}^3 \mid -x_0^2 + x_1^2 + x_2^2 = 1\}; \tag{A.49}$$
$$H^2 = \{(x_0, x_1, x_2) \in \mathbb{R}^3 \mid -x_0^2 + x_1^2 + x_2^2 = -1\}; \tag{A.50}$$
$$AdS^2 = \{(x_{-1}, x_0, x_1) \in \mathbb{R}^3 \mid -x_{-1}^2 - x_0^2 + x_1^2 = -1\}, \tag{A.51}$$

and the Lie groups in question were defined in Appendix A.1. In d = 2 de Sitter space $dS^2$ is 
diffeomorphic to anti de Sitter space $AdS^2$, but they will be different as Lorentzian manifolds, as 
in the case $\mathbb{R}^2$ in (A.46). To verify (A.45), let O(3) act on $S^2$ by restricting its defining action on 
$\mathbb{R}^3$, and take $x' \in S^2$ to be the north pole (0,0,1), in which case the O(2) in (A.45) consists of 
rotations around the z-axis and reflections in planes through the origin that contain the z-axis.⁶⁵⁴ 
Similarly, for $dS^2$, where one also takes x' = (0,0,1), and for $H^2$ and $AdS^2$, where the most 
convenient fiducial point is (1,0,0). For (A.46), we let the 2d Euclidean group E(2) and the 2d 
Poincaré group P(2) act on $\mathbb{R}^2$ in the defining representation, and take x' = (0,0). 

Writing (A.45) - (A.47) generically as M = G/H, where H = O(2) or H = O(1,1), the 
Ad′(H)-invariant decomposition (A.42) applies to each G in the list. In all six cases we have 

$$\mathfrak{g} \cong \mathbb{R}^3, \qquad \mathfrak{h} \cong \mathbb{R}, \qquad \mathfrak{p} \cong \mathbb{R}^2, \tag{A.52}$$

as vector spaces, taking $\mathfrak{h}$ to be the linear span of the third generator and $\mathfrak{p}$ to be the linear span 
of the first two generators: see (A.21) for the Lie algebra of O(3) as relevant for $S^2$, see (A.24) 
for SO(1,2) in the context of $dS^2$, see (A.26) again for SO(1,2) but now applied to $H^2$, then 
(A.27) for O(2,1) applied to $AdS^2$, then (A.31) for E(2) applied to $\mathbb{R}^2$ in Riemannian signature, 
and finally, eq. (A.34) for P(2) applied to $\mathbb{R}^2$ in Lorentzian signature. All cases give 

$$[\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p}. \tag{A.53}$$


Lemma A.5 In all six cases the decomposition (A.42) is Ad′(H)-invariant. In addition, under 
the last isomorphism in (A.52) the adjoint H-action on $\mathfrak{p}$ is just its defining action on $\mathbb{R}^2$. 


The significance of this observation will become clear in the next section. The proof is long! 


Proof. Let $u : G \to GL(V)$ be a representation (i.e. a homomorphism) of a Lie group G on a 
finite-dimensional vector space V. For $A \in \mathfrak{g}$ we define a linear map $du(A) : V \to V$ by 

$$du(A)v = \frac{d}{dt}\, u(e^{tA})v\Big|_{t=0}. \tag{A.54}$$


654 Note that O(2) has two components, like O(3), again distinguished by det = ±1. Elements $\gamma$ with $\det(\gamma) = 1$ 
are rotations whereas those with $\det(\gamma) = -1$ are reflections in a line through the origin (e.g. diag(1,−1)). 

Letting $A \in \mathfrak{g}$ vary, this construction gives a linear map $du : \mathfrak{g} \to \mathrm{Hom}(V)$, which satisfies 

$$[du(A), du(B)] = du([A,B]); \qquad e^{du(A)} = u(e^A). \tag{A.55}$$

In particular, if G is connected, then u can be recovered from du via (A.55). If G is simply 
connected, this even gives an equivalence between finite-dimensional Lie group and Lie algebra 
representations. For example, the adjoint representation $\mathrm{Ad}' : G \to GL(\mathfrak{g})$, $\mathrm{Ad}'(\gamma)A = \gamma A \gamma^{-1}$, 
defines a Lie algebra homomorphism $\mathrm{ad} : \mathfrak{g} \to \mathrm{Hom}(\mathfrak{g})$,⁶⁵⁵ where ad = d(Ad′), namely 

$$\mathrm{ad}(A)B = [A, B]. \tag{A.56}$$

In view of this, for G = O(3), the commutation relations (A.21) show that 

$$\mathrm{ad}(J_3)J_1 = J_2; \qquad \mathrm{ad}(J_3)J_2 = -J_1, \tag{A.57}$$

where we repeat that $J_3$ is the generator of the subgroup SO(2) of O(3) that consists of rotations 
around the z-axis. This means that as a matrix relative to the basis $(J_1, J_2)$ of $\mathbb{R}^2$, the restriction 
of the linear map $\mathrm{ad}(J_3) : \mathfrak{so}(3) \to \mathfrak{so}(3)$ to $\mathfrak{p} = \mathrm{span}(J_1, J_2) \cong \mathbb{R}^2$ (which restriction is well 
defined, as the above relations show) is just the usual generator of $\mathfrak{so}(2)$, see (A.31), which 
is obtained from the defining action id of G = O(2) on $V = \mathbb{R}^2$ by the procedure (A.54). By 
exponentiation, we then conclude that the corresponding Ad-action of SO(2) on $\mathfrak{p}$ is the defining 
action, too. To obtain the Ad-action of all of O(2) it suffices to take the element 


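$$R = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \text{which, seen as an element of } O(3), \text{ is } \quad R_x = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

the reflection in the x-z plane. Its adjoint action on the generators of $\mathfrak{p} \cong \mathbb{R}^2$ can be computed to give 

$$\mathrm{Ad}_{R_x} J_1 = R_x J_1 R_x^{-1} = -J_1; \qquad \mathrm{Ad}_{R_x} J_2 = R_x J_2 R_x^{-1} = J_2, \tag{A.58}$$

which means that the adjoint representation of $R_x$ on $\mathfrak{p} = \mathrm{span}(J_1, J_2)$ is not only well defined in 
mapping $\mathfrak{p}$ to itself, but also that under the identification $\mathfrak{p} \cong \mathbb{R}^2$, $\mathrm{Ad}_{R_x}$ acts as a reflection in a line through the origin. 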


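However, the $(J_1, J_2)$ basis of $\mathbb{R}^2$ is not the geometrically natural basis in the given context. 
Instead, we compute the map (A.36), first at arbitrary points $(x_1, x_2, x_3) \in S^2$. This gives 

$$J_1 \mapsto x_2 \frac{\partial}{\partial x_3} - x_3 \frac{\partial}{\partial x_2}; \qquad J_2 \mapsto x_3 \frac{\partial}{\partial x_1} - x_1 \frac{\partial}{\partial x_3}; \qquad J_3 \mapsto -x_2 \frac{\partial}{\partial x_1} + x_1 \frac{\partial}{\partial x_2}. \tag{A.59}$$

At the point $x' = (0,0,1) \in S^2$, the generator $J_3$ is mapped to zero and for the other two we have 

$$J_1 \mapsto -\frac{\partial}{\partial x_2}; \qquad J_2 \mapsto \frac{\partial}{\partial x_1}. \tag{A.60}$$

Hence the natural basis for $T_{x'}S^2 \cong \mathbb{R}^2$ is $u_1 = (1,0) = J_2 = \partial/\partial x_1$ and $u_2 = (0,1) = -J_1 = \partial/\partial x_2$. 
Fortunately, this leads to exactly the same conclusions as the previous basis, as the 
reader can easily verify. This proves Lemma A.5 for $S^2$, i.e., G = SO(3) and H = SO(2). 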
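The verification left to the reader can be sketched numerically. The code below assumes the standard matrix realization of the generators, $(J_i)_{jk} = -\varepsilon_{ijk}$, so that each $J_i$ generates rotations about the i-th axis; this choice is our assumption, since (A.21) is not reproduced in this section, but it is consistent with the brackets (A.57) and with the images of $J_1$ and $J_2$ at the north pole stated above. The helper names are illustrative only.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

# Assumed generators of so(3): J_i x = e_i cross x.
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

NORTH = [0, 0, 1]                          # fiducial point x'
R_X = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]   # reflection in the x-z plane

# Natural basis of p: u1 -> d/dx1 and u2 -> d/dx2 at the north pole.
u1, u2 = J2, scale(-1, J1)

def coords(a):
    """Tangent vector at x' generated by a, in the coordinates (x1, x2)."""
    return apply(a, NORTH)[:2]

def matrix_on_p(f):
    """Matrix of a linear map f: p -> p relative to the basis (u1, u2)."""
    cols = [coords(f(u)) for u in (u1, u2)]
    return [[cols[j][i] for j in range(2)] for i in range(2)]

ad_J3 = matrix_on_p(lambda a: comm(J3, a))
Ad_R = matrix_on_p(lambda a: matmul(matmul(R_X, a), R_X))  # R_X is its own inverse

print(ad_J3)  # generator of so(2) in the natural basis
print(Ad_R)   # action of the reflection in the natural basis
```

Under these assumptions ad(J3) is represented by the rotation generator and the reflection acts as diag(1,-1), i.e. both reproduce the defining action of O(2) on the tangent plane.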


655 Each map ad(A) is even a derivation of $\mathfrak{g}$ as a Lie algebra, as follows from the Jacobi identity. 

For hyperbolic space $H^2$, that is, G = SO(1,2) and H = SO(2), we use the basis $(e'_1, e'_2, e'_3)$ 
of the Lie algebra of O(1,2) defined in (A.26) and the fiducial point $x' = (1,0,0) \in H^2$. Then 

$$e'_1 \mapsto \frac{\partial}{\partial x_1}; \qquad e'_2 \mapsto \frac{\partial}{\partial x_2} \tag{A.61}$$

in coordinates $(x_0, x_1, x_2)$ on $\mathbb{R}^3$, so that for $H^2$ the right basis of $\mathbb{R}^2$, seen as $\mathfrak{p} = \mathrm{span}(e'_1, e'_2)$, 
is simply $u_1 = e'_1$ and $u_2 = e'_2$. The Lie brackets (A.26), now reinterpreted as 

$$\mathrm{ad}(e'_3)e'_1 = e'_2; \qquad \mathrm{ad}(e'_3)e'_2 = -e'_1, \tag{A.62}$$

then lead to the same conclusion as for $S^2$: the Ad-action of SO(2) on $\mathbb{R}^2$ is the defining action. 
This is also true for the component that is not connected to the identity; the matrix R shown 
above is now embedded into SO(1,2) as diag(1,1,−1), but the result remains Ad(R) = R. 

For Euclidean $\mathbb{R}^2$, i.e. G = E(2) and H = O(2), $\mathfrak{p}$ is literally $\mathbb{R}^2$, seen as the Lie algebra of 
the second factor in the semidirect product (A.8), and the lemma should be evident from (A.32). 


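For de Sitter space $dS^2$ we need similar computations as for $S^2$ and $H^2$. Using the basis 
(A.24) and the fiducial point x' = (0,0,1) on $dS^2$, we find 

$$e_1 \mapsto -\frac{\partial}{\partial x_1}; \qquad e_2 \mapsto -\frac{\partial}{\partial x_0} \tag{A.63}$$

in coordinates $(x_0, x_1, x_2)$ on $\mathbb{R}^3$, making $u_0 = (1,0) = -e_2$ and $u_1 = (0,1) = -e_1$ the natural 
basis of $\mathfrak{p} \cong \mathbb{R}^2$. This gives $\mathrm{ad}(e_3)u_0 = -[e_3, e_2] = -e_1 = u_1$ and $\mathrm{ad}(e_3)u_1 = -[e_3, e_1] = -e_2 = u_0$, 
which implies that $\mathrm{ad}(e_3)$ is the matrix $k_3$ in (A.33), which generates the 2d boosts 

$$\begin{pmatrix} \cosh\chi & \sinh\chi \\ \sinh\chi & \cosh\chi \end{pmatrix} = e^{\chi k_3}. \tag{A.64}$$

Thus the adjoint $k_3$-action on $\mathfrak{p}$ generates the defining $O(1,1)_0$-action on $\mathbb{R}^2$. This time there are 
three other components that contribute to the full O(1,1)-action, generated by the matrices 

$$P = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad T = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad PT = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{A.65}$$

which as elements of O(1,2) under $O(1,1) \subset O(1,2)$ are given by 

$$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad T = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad PT = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{A.66}$$

Then $\mathrm{Ad}(P)u_0 = -Pe_2P^{-1} = -e_2 = u_0$ and $\mathrm{Ad}(P)u_1 = -Pe_1P^{-1} = e_1 = -u_1$, which show 
that Ad(P) = P on $\mathfrak{p} \cong \mathbb{R}^2$. The other two cases are similar, which settles the case of $dS^2$. 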
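The claim that the matrix $k_3$ generates the 2d boosts, and that the boosts together with P, T and PT preserve the 2d Minkowski metric, can be illustrated numerically. The sketch below assumes $k_3$ = ((0,1),(1,0)), P = diag(1,-1), T = diag(-1,1) and the metric diag(-1,1), as we read them from (A.33) and (A.65); the truncated power series used for the matrix exponential and all function names are our own illustrative choices.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def expm(a, terms=40):
    """Matrix exponential of a 2x2 matrix via its truncated power series."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

ETA = [[-1, 0], [0, 1]]   # 2d Minkowski metric
K3 = [[0, 1], [1, 0]]     # assumed boost generator
P = [[1, 0], [0, -1]]
T = [[-1, 0], [0, 1]]
PT = matmul(P, T)

def preserves_eta(m):
    """Check m^T eta m = eta, i.e. membership of O(1,1)."""
    return close(matmul(matmul(transpose(m), ETA), m), ETA)

chi = 0.7  # arbitrary rapidity
boost = expm([[chi * entry for entry in row] for row in K3])
expected = [[math.cosh(chi), math.sinh(chi)], [math.sinh(chi), math.cosh(chi)]]

print(close(boost, expected))
print([preserves_eta(m) for m in (boost, P, T, PT)])
```

The four matrices boost, P·boost, T·boost and PT·boost then represent the four connected components of O(1,1).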
For $AdS^2$, in coordinates $(x_{-1}, x_0, x_1)$ with fiducial point x' = (1,0,0), we obtain 

$$f_1 \mapsto -\frac{\partial}{\partial x_1}; \qquad f_2 \mapsto -\frac{\partial}{\partial x_0}, \tag{A.67}$$

which is similar to (A.63). Indeed, using the basis $u_0 = (1,0) = -f_2$ and $u_1 = (0,1) = -f_1$, and 
the Lie brackets (A.28), we obtain $\mathrm{ad}(f_3)u_0 = -[f_3, f_2] = -f_1 = u_1$ and likewise $\mathrm{ad}(f_3)u_1 = 
-[f_3, f_1] = -f_2 = u_0$, so that once again $\mathrm{ad}(f_3)$ is given by the matrix $k_3$ in (A.33), which 
generates the 2d boosts in (A.64). We leave the verification that the discrete elements (A.65) 
of O(1,1) also act correctly on $\mathfrak{p} = \mathrm{span}(f_1, f_2)$ to the reader; their embedding in O(2,1) is 
different from (A.66), and is now given by always having +1 in the upper left entry. 

Finally, the case of Minkowski $\mathbb{R}^2$, with G = P(2) acting on $\mathbb{R}^2$ and H = O(1,1), is very 
similar to the Euclidean case, with eqs. (A.33) - (A.34) replacing (A.31) - (A.32). 

A.4 Symmetric spaces 


So far, this was an exercise in Lie group theory and differential geometry. We now bring in a 
metric. The relationship between homogeneous spaces and metric geometry is twofold: 


1. Given M = G/H (always connected), one may study possible G-invariant metrics g on M. 
2. Given (M,g), one may find out if the isometry group of g possibly acts transitively on M. 


In general, a metric g on M is invariant under a diffeomorphism $\varphi$ of M if 

$$\varphi^* g = g \;\Leftrightarrow\; g_{\varphi(x)}(\varphi'_x(X), \varphi'_x(Y)) = g_x(X, Y) \quad \forall x \in M,\; X, Y \in T_xM. \tag{A.68}$$

The set of all such diffeomorphisms $\varphi$ is the isometry group of (M,g), denoted by Iso(M,g). If 
M is a G-space, we say that g is G-invariant if $\varphi_\gamma^* g = g$ for all $\gamma \in G$. If this is the case, then 
$G \subset \mathrm{Iso}(M,g)$ by definition (typically without equality). If in addition G acts transitively on M, 
we say that (M,g) is a homogeneous (semi) Riemannian manifold, so that $M \cong G/H$. 

We return to the second point in the next section. The first is settled as follows: 


Proposition A.6 1. There is a bijective correspondence between G-invariant semi-Riemannian 
metrics on G/H and Ad′(H)-invariant metrics on $\mathfrak{g}/\mathfrak{h}$ (in the sense of 
Definition 2.5) and hence, if (A.42) applies, on $\mathfrak{p}$. 

2. If $H = G_\Gamma$ for some $G_\Gamma \subset GL_n(\mathbb{R})$ as defined in (A.4), and the Ad′(H)-action on $\mathfrak{g}/\mathfrak{h}$ (or, 
if applicable, on $\mathfrak{p}$) is equivalent to the defining action of $G_\Gamma$ on $\mathbb{R}^n$, then $\mathfrak{g}/\mathfrak{h}$ (etc.) has a 
unique Ad′(H)-invariant metric (up to scaling by a nonzero constant), and hence G/H 
has a unique G-invariant semi-Riemannian metric (up to scaling by a nonzero constant). 


Proof. To prove the first claim, just use (A.35) or (A.43): any metric on $\mathfrak{g}/\mathfrak{h}$ or $\mathfrak{p}$ defines a metric 
$g_H$ on $T_H(G/H)$, which the G-action then pushes to any other point. Invariance under G clearly 
requires $\varphi_k^* g_H = g_H$ for any $k \in H$, so that Proposition A.4 shows that Ad′(H)-invariance of the 
inner product is necessary. It is a simple exercise to show that it is also sufficient.⁶⁵⁶ 

For the second claim, any metric g on $\mathbb{R}^n$ takes the form 

$$g(x, y) = \langle x, Ay \rangle_\Gamma = \langle x, \Gamma A y \rangle, \tag{A.69}$$

for some $A \in GL_n(\mathbb{R})$ (to see this, regard metrics as symmetric quadratic forms). $G_\Gamma$-invariance 
of $\langle -,- \rangle_\Gamma$ gives $\langle \gamma x, y \rangle_\Gamma = \langle x, \gamma^{-1} y \rangle_\Gamma$, so that $G_\Gamma$-invariance of g, i.e. $g(\gamma x, \gamma y) = g(x, y)$ for all 
$x, y \in \mathbb{R}^n$ and $\gamma \in G_\Gamma$, is equivalent to $[A, \gamma] = 0$ for all $\gamma \in G_\Gamma$. Since the $G_\Gamma$-action on $\mathbb{R}^n$ is 
irreducible, Schur's lemma gives $A = \lambda \cdot \mathrm{id}$, for some $\lambda \neq 0$, so that $g(x, y) = \lambda \langle x, y \rangle_\Gamma$. 
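For the group O(1,1) acting on $\mathbb{R}^2$ the last step can be made explicit without invoking Schur's lemma. The worked example below assumes the matrices P = diag(1,-1) and $k_3$ = ((0,1),(1,0)) as we read them from (A.65) and (A.33); commuting with all boosts $e^{\chi k_3}$ is equivalent to commuting with $k_3$.

```latex
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad
[A, P] = \begin{pmatrix} 0 & -2b \\ 2c & 0 \end{pmatrix} = 0
  \;\Rightarrow\; b = c = 0,
\qquad
[A, k_3] = \begin{pmatrix} b - c & a - d \\ d - a & c - b \end{pmatrix} = 0
  \;\Rightarrow\; a = d .
```

Hence $A = a \cdot \mathrm{id}$, so that in this case every O(1,1)-invariant metric on $\mathbb{R}^2$ is a multiple of the Minkowski metric.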


Proposition A.7 For any Riemannian or Lorentzian manifold (M,g) and $G \subset \mathrm{Iso}(M,g)$, the 
isotropy representation $\tau_x$ of $G_x$ defined in (A.40) is injective. 


Proof. Near x, any isometry $\varphi$ of M that fixes x is determined by its tangent map $\varphi'_x$: to 
find $\varphi(y)$ for y in a normal nbhd $U_x$, assume $y = \exp_x(Y)$ for some $Y \in T_xM$. If $\varphi$ is an isometry 
with $\varphi(x) = x$, then $\varphi(\exp_x(Y)) = \exp_x(\varphi'_x(Y))$. Injectivity of $\tau_x$ then follows from (A.40). 


656 See e.g. Proposition 3.1 in Kobayashi & Nomizu (1969) or Proposition 11.22 in O'Neill (1983). 

Our proof of Corollary 4.11 in the main text is based on the concept of a symmetric space.⁶⁵⁷ 


Definition A.8 1. A (semi) Riemannian manifold (M,g) is locally symmetric if each $x \in M$ 
has a normal nbhd $U_x$ and an isometry $\iota_x : U_x \to U_x$ with the following properties: 

$$\iota_x(x) = x; \qquad (\iota_x)'_x = -\mathrm{id}_{T_xM}. \tag{A.70}$$

2. It is called symmetric if, for each $x \in M$, the above properties hold for $U_x = M$. 


Such a map $\iota_x$ is often called a geodesic reflection, since (A.70) is equivalent to 

$$\iota_x(\exp_x(X)) = \exp_x(-X), \qquad X \in \exp_x^{-1}(U_x) \subset T_xM. \tag{A.71}$$

Eq. (A.71), and hence also (A.70), gives $\iota_x^2 = \mathrm{id}_{U_x}$. Eq. (A.71) easily implies (A.70), and the 
converse implication follows from the fact that, as just mentioned, near x a local isometry $\varphi$ is 
determined by its tangent $\varphi'_x$ at x. In view of the assumptions in Corollary 4.11, we note:⁶⁵⁸ 


Lemma A.9 If (M,g) is complete and simply connected and is locally symmetric, then it is 
symmetric. Conversely, a symmetric space is complete. 


The connection between symmetric spaces and spaces with constant curvature will run via: 


Lemma A.10 A space (M,g) is locally symmetric iff $\nabla \mathrm{Riem} = 0$. 


The implication "⇒" is a simple exercise. For the converse, take $x, y \in M$ and let $F : T_xM \to T_yM$ 
be a linear isomorphism. If $U_x$ and $U_y$ are normal nbhds of x and y, we obtain a map 

$$f : U_x \to U_y; \qquad f = \exp_y \circ F \circ \exp_x^{-1}. \tag{A.72}$$

It follows from the Cartan-Ambrose-Hicks theorem that if F preserves both the metric and the 
Riemann tensor, and in addition $\nabla \mathrm{Riem} = 0$, then f is an isometry.⁶⁵⁹ If x = y, then $F := -\mathrm{id}_{T_xM}$ 
trivially satisfies the assumptions of this theorem, simply because both g and Riem have even 
rank (namely 2 and 4, respectively). The ensuing map f is our desired local isometry $\iota_x$. 


Proposition A.11 The isometry group Iso(M,g) of a symmetric space acts transitively on M. 


Moreover, already its identity component Iso(M, g)o acts transitively on M. 


Proof. First assume that any two points y, z of M may be connected by a geodesic $\gamma$ (in the 
Riemannian case this is true by Lemma A.9 and the Hopf-Rinow theorem). So let $y = \gamma(0)$ and 
$z = \gamma(T)$. Then $y = \iota_x(z)$ for $x = \gamma(T/2)$, and we recall that $\iota_x$ is an isometry. In general, the 
same argument applies to each segment of a chain of geodesic segments connecting y and z. This 
argument can be iterated to connect y to z via a composition of arbitrarily many small geodesic 
reflections, each contained in $\mathrm{Iso}(M,g)_0$, which yields the second claim. 
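The midpoint argument in this proof can be illustrated on the unit sphere $S^2 \subset \mathbb{R}^3$. The sketch below assumes the explicit formula $\iota_x(y) = 2\langle x, y\rangle x - y$ for the geodesic reflection at x, which is a standard fact about the round sphere but is not stated in the text; the great circle and the value of T are arbitrary choices made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def geodesic_reflection(x, y):
    """Geodesic reflection at x on the unit sphere: y -> 2<x,y> x - y."""
    c = 2 * dot(x, y)
    return [c * a - b for a, b in zip(x, y)]

def great_circle(t):
    """Unit-speed geodesic on S^2 through (1,0,0) with initial velocity (0,1,0)."""
    return [math.cos(t), math.sin(t), 0.0]

def close(u, v, tol=1e-12):
    return all(abs(a - b) < tol for a, b in zip(u, v))

T = 1.3                                   # arbitrary parameter length
y, z = great_circle(0.0), great_circle(T)
x = great_circle(T / 2)                   # midpoint of the geodesic from y to z

image = geodesic_reflection(x, z)
print(close(image, y))                             # the reflection at the midpoint maps z to y
print(close(geodesic_reflection(x, image), z))     # and it is an involution
```

Since the reflection is the restriction of an orthogonal matrix, it is an isometry of the sphere, so the isometry group indeed moves z to y.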


657 See Helgason (1978), passim, Kobayashi & Nomizu (1969), chapter IX, and Joos (2002), chapter 5. 

658 See Kobayashi & Nomizu (1969), Corollary VI.7.9 and Theorems XI.1.2 and 1.3. 

659 See Kobayashi & Nomizu (1963), Theorem 7.4. The Cartan-Ambrose-Hicks theorem states that f is a (local) 
isometry iff F preserves g and Riem, and for all $Y \in T_xM$ such that $\exp_x(Y) \in U_x$ one has 

$$\mathrm{Riem}_{\exp_y(F(Y))}(P_Y(U), P_Y(V), P_Y(W), P_Y(X)) = \mathrm{Riem}_{\exp_x(Y)}(U, V, W, X)$$

for all $U, V, W, X \in T_{\exp_x(Y)}M$, where $P_Y : T_{\exp_x(Y)}M \to T_xM \to T_yM \to T_{\exp_y(F(Y))}M$ is the composition of parallel transport 
along the geodesic $\gamma_Y$ (traversed backward), the map F, and parallel transport along $\gamma_{F(Y)}$. This condition is automatically satisfied when $\nabla \mathrm{Riem} = 0$. 

A.5 Classification of spaces with constant curvature 


Our proof of Corollary 4.11 consists of three steps, of which we state the first two as a lemma: 


Lemma A.12 1. If (M,g) has constant curvature, it is locally symmetric. Consequently, if 
(M,g) is simply connected, complete, and has constant curvature, then it is symmetric. 


2. If (M,g) is symmetric, then it is a homogeneous (semi) Riemannian manifold. 


Therefore, among (semi) Riemannian manifolds we have the following implications: 
constant curvature ⇒ symmetric ⇒ homogeneous. 


This lemma reduces the classification problem of spaces with constant curvature to a problem 
in Lie groups and Lie algebras, which we will discuss and solve. The second part of the above 
lemma is a restatement of Proposition A.11. For the first part we return to the proof of Proposition 
4.7, namely $\mathrm{Riem}_x = k(x)S$. Taking the covariant derivative with respect to an arbitrary vector 
field $U \in \mathfrak{X}(M)$ gives $\nabla_U \mathrm{Riem} = (Uk) \cdot S$, since $\nabla_U S = 0$ by definition of the Levi-Civita 
connection (which gives $\nabla_U g = 0$). Eq. (4.23) then gives, for arbitrary $X, Y, Z \in \mathfrak{X}(M)$, 

$$\begin{aligned}
&(Uk) \cdot (g(Z,Y)X - g(Z,X)Y) + (Xk) \cdot (g(Z,U)Y - g(Z,Y)U) \\
&\quad + (Yk) \cdot (g(Z,X)U - g(Z,U)X) = 0. 
\end{aligned} \tag{A.73}$$

The first part of the lemma then follows from Lemma A.10.⁶⁶⁰ 


Hence under the assumptions of Corollary 4.11 we have $M \cong G/H$, with G = Iso(M,g) (or 
$G = \mathrm{Iso}(M,g)_0$), and $H = G_{x'}$ for some $x' \in M$ (or its identity component $H_0$). By Proposition 
4.10, the given G-invariant (constant curvature) metric g on M is entirely determined by some 
suitable inner product $\langle \cdot, \cdot \rangle$ on $\mathfrak{g}/\mathfrak{h}$, and by Proposition A.4 the H-action on $T_{x'}M$ is mapped 
to the Ad′(H)-action on $\mathfrak{g}/\mathfrak{h}$ (which by implication preserves $\langle \cdot, \cdot \rangle$). By Proposition A.7 the 
representation Ad′ is injective on H, so if we choose an orthonormal basis of $\mathfrak{g}/\mathfrak{h}$ with respect to 
$\langle \cdot, \cdot \rangle$, and hence obtain an identification $\mathfrak{g}/\mathfrak{h} \cong \mathbb{R}^n$, we may also identify $H \cong \mathrm{Ad}'(H)$ with a 
certain subgroup of O(n) in the Riemannian case, or of O(1,n−1) in the Lorentzian case. 


Lemma A.13 If, in the situation just described, (M, g) has constant curvature and G = Iso(M, g), 
then H = O(n) in the Riemannian case and H = O(1,n— 1) in the Lorentzian case. 


This follows by the argument in the proof of Lemma A.10, which applies because constant 
curvature implies Riem = k- S, see Proposition 4.7 and especially eq. (4.86). Any element 
F € O(n) or F € O(1,n— 1) preserves the inner product, and hence the metric, and hence, by 
the above formula, the Riemann tensor. Thus F comes from an isometry f, i.e. F € H. 


We now know that M = G/H as a homogeneous Riemannian or Lorentzian manifold, where 


$$G = \mathrm{Iso}(M, g); \tag{A.74}$$
$$H = O(n) \text{ or } O(1, n-1). \tag{A.75}$$


660 This argument also leads to a proof of the claim below Definition 4.6 to the effect that if M is connected, 
dim(M) ≥ 3, and $C_x(X,Y)$ is independent of X and Y for each x, then this common value is also independent of x. 
Indeed, in d ≥ 3 we may take Z = U to be unit vectors and (X,Y,Z) mutually perpendicular, so that (A.73) yields 
$(Xk) \cdot Y - (Yk) \cdot X = 0$. Since this is true for all $X \perp Y$, it follows that Xk = Yk = 0, and hence k is constant. 

Since O(n) and O(1,n−1) act irreducibly on $\mathbb{R}^n$, so that Ad′(H) is irreducible on $\mathfrak{g}/\mathfrak{h}$, by 
Proposition A.6 there is exactly one possible G-invariant metric g on G/H (up to scaling). 
We now transfer the involutions $\iota_x$ on M to G. Since for all $x \in M$ and $\gamma \in \mathrm{Iso}(M,g)$ one has 

$$\gamma\, \iota_x\, \gamma^{-1} = \iota_{\gamma x}, \tag{A.76}$$

it is sufficient to consider a single $\iota_{x'} : M \to M$, where $x' \in M$ is arbitrary. For (A.74), define 

$$\iota : G \to G; \tag{A.77}$$
$$\gamma \mapsto \iota_{x'}\, \gamma\, \iota_{x'}. \tag{A.78}$$


Using (A.76) and the definition of the maps $\iota_x$, it is easy to show that $\iota$ has the properties 

$$\iota \neq \mathrm{id}_G; \qquad \iota^2 = \mathrm{id}_G; \qquad \iota(\gamma\delta) = \iota(\gamma)\iota(\delta). \tag{A.79}$$

We defined $\iota$ by (A.78) for (A.74) - (A.75), in which context (A.79) follow from the definition. 
Conversely, for any Lie group G one may start with a nontrivial smooth involutive automorphism 
(A.77), i.e. a map (A.77) satisfying (A.79), called a Cartan involution on G, and define 

$$H := G^\iota = \{\gamma \in G \mid \iota(\gamma) = \gamma\} \tag{A.80}$$

as the fixed-point set of $\iota$. Then construct a family $(\iota_x)_{x \in G/H}$ of diffeomorphisms of G/H by 

$$\iota_H(\gamma H) := \iota(\gamma)H; \tag{A.81}$$
$$\iota_{\gamma H}(x) := \gamma \cdot \iota_H(\gamma^{-1} x). \tag{A.82}$$

If H is connected, these procedures are equivalent; if H is disconnected, then $G^\iota_0 \subset H \subset G^\iota$.⁶⁶¹ 
Thus one may start either with a symmetric space (M,g) or with the corresponding (Lie) group-theoretical 
data $(G, \iota)$. Up to issues with connectedness, which have to be dealt with by hand, 
these group-theoretical data can in turn be replaced by algebraic data, to which we now turn. 
Since $\iota : G \to G$ is smooth, it has a derivative $\iota' : \mathfrak{g} \to \mathfrak{g}$, defined by, cf. (A.16), 

$$\iota'(A) = \frac{d}{dt}\, \iota(e^{tA})\Big|_{t=0}. \tag{A.83}$$


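As in (A.55), this map satisfies $\exp(\iota'(A)) = \iota(\exp(A))$. From this, and $\iota^2 = \mathrm{id}_G$, we compute 

$$\iota' \circ \iota'(A) = \frac{d}{dt}\, \iota(e^{t\iota'(A)})\Big|_{t=0} = \frac{d}{dt}\, \iota(\iota(\exp(tA)))\Big|_{t=0} = \frac{d}{dt}\, e^{tA}\Big|_{t=0} = A, \tag{A.84}$$

so that $(\iota')^2 = \mathrm{id}_{\mathfrak{g}}$. We therefore have our promised canonical decomposition (A.42), in which $\mathfrak{h}$ 
and $\mathfrak{p}$ are the eigenspaces of $\iota'$ with eigenvalue 1 and −1, respectively. Furthermore, it follows 
from the last entry in (A.79) that $\iota'$ is a Lie algebra automorphism, i.e., $\iota'$ is linear and 

$$\iota'([A, B]) = [\iota'(A), \iota'(B)]. \tag{A.85}$$

This implies the following properties (of which the first one is trivial since $H \subset G$ is a subgroup): 

$$[\mathfrak{h}, \mathfrak{h}] \subset \mathfrak{h}; \qquad [\mathfrak{h}, \mathfrak{p}] \subset \mathfrak{p}; \qquad [\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}. \tag{A.86}$$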
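For the sphere these algebraic statements can be checked by hand or by machine. The sketch below assumes the standard generators $(J_i)_{jk} = -\varepsilon_{ijk}$ of so(3) and takes the geodesic reflection at the north pole to be the matrix S = diag(-1,-1,1), so that the involution is conjugation by S; both choices are our assumptions for the purpose of illustration, as are the function names.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def comm(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def scale(c, a):
    return [[c * entry for entry in row] for row in a]

# Assumed generators of so(3): J_i x = e_i cross x.
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
BASIS = (J1, J2, J3)

# Assumed geodesic reflection of S^2 at the north pole, as an element of O(3).
S = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]

def iota_prime(a):
    """Derivative of the involution gamma -> S gamma S^{-1}; S is its own inverse."""
    return matmul(matmul(S, a), S)

# Eigenspaces of iota': eigenvalue +1 spans h, eigenvalue -1 spans p.
h_part = [a for a in BASIS if iota_prime(a) == a]
p_part = [a for a in BASIS if iota_prime(a) == scale(-1, a)]

# iota' is a Lie algebra automorphism, cf. (A.85).
is_automorphism = all(
    iota_prime(comm(a, b)) == comm(iota_prime(a), iota_prime(b))
    for a in BASIS for b in BASIS
)

print(h_part == [J3], p_part == [J1, J2], is_automorphism)
print(comm(J1, J2) == J3)   # [p, p] lands in h
```

The inclusion $[\mathfrak{p}, \mathfrak{p}] \subset \mathfrak{h}$ is then visible directly: the bracket of two elements with eigenvalue −1 has eigenvalue (−1)(−1) = 1.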


661 See Helgason (1978) or even wikipedia, symmetric space, which entry is excellent. 

We return to the proof of Corollary 4.11 (our classification problem). Proposition A.6 and (A.74) - 
(A.75) apply, as well as the remarks preceding Lemma A.13. Consequently, 

$$\mathfrak{p} \cong \mathbb{R}^n, \tag{A.87}$$


and the Ad′(H)-action on $\mathfrak{p}$ is the defining action of H = O(n) or H = O(1,n−1) on $\mathbb{R}^n$. By 
(A.56), the derivative of the Ad′(H)-action is the $\mathrm{ad}(\mathfrak{h})$-action, i.e. for $A \in \mathfrak{h}$ and $v \in \mathfrak{p}$ we have 

$$[A, v] = A \cdot v, \tag{A.88}$$

where $A \cdot v$ is the derivative of the defining action of $H_0$, see (A.54). Since the Lie bracket [A,B] 
for $A, B \in \mathfrak{h}$ is also known (because $\mathfrak{h} = \mathfrak{o}(n)$ or $\mathfrak{h} = \mathfrak{o}(1,n-1)$), by (A.86) all we need to find 
out to determine $\mathfrak{g}$ as a Lie algebra (and hence, by Lie's third theorem,⁶⁶² to determine G as a 
Lie group) is the commutator $[u, v] \in \mathfrak{h}$ of $u, v \in \mathfrak{p} \cong \mathbb{R}^n$. For n = 2 the only known unknown is 

$$[T_1, T_2] = p\,T_3, \tag{A.89}$$

for some constant $p \in \mathbb{R}$, where $(T_1, T_2)$ is some basis of $\mathbb{R}^2$ and 

$$T_3 = j_3 \qquad (H = O(2)); \tag{A.90}$$
$$T_3 = k_3 \qquad (H = O(1,1)), \tag{A.91}$$

see (A.31) and (A.33), respectively. Rescaling the metric by a positive constant then restricts us 
to p = 1, 0, −1. For H = O(2) this leaves us with the following list: 

$$p = 1: \quad [T_1, T_2] = T_3; \qquad [T_3, T_1] = T_2; \qquad [T_3, T_2] = -T_1; \tag{A.92}$$
$$p = 0: \quad [T_1, T_2] = 0; \qquad [T_3, T_1] = T_2; \qquad [T_3, T_2] = -T_1; \tag{A.93}$$
$$p = -1: \quad [T_1, T_2] = -T_3; \qquad [T_3, T_1] = T_2; \qquad [T_3, T_2] = -T_1. \tag{A.94}$$

These are the Lie algebras of O(3), E(2), and O(1,2), respectively, cf. (A.21), (A.32), and 
(A.26). Thus we find the homogeneous spaces 

$$O(3)/O(2) \cong S^2; \qquad p = 1, \tag{A.95}$$
$$E(2)/O(2) \cong \mathbb{R}^2; \qquad p = 0, \tag{A.96}$$
$$O(1,2)/O(2) \cong H^2; \qquad p = -1, \tag{A.97}$$

see the left-hand sides of (A.45) - (A.47) in §A.3. 
For H = O(1,1) each third bracket changes sign ($k_3$ versus $j_3$), and hence we obtain 

$$p = 1: \quad [T_1, T_2] = T_3; \qquad [T_3, T_1] = T_2; \qquad [T_3, T_2] = T_1; \tag{A.98}$$
$$p = 0: \quad [T_1, T_2] = 0; \qquad [T_3, T_1] = T_2; \qquad [T_3, T_2] = T_1; \tag{A.99}$$
$$p = -1: \quad [T_1, T_2] = -T_3; \qquad [T_3, T_1] = T_2; \qquad [T_3, T_2] = T_1. \tag{A.100}$$


662 Let $\mathfrak{g}$ be a Lie algebra. There exists a simply connected Lie group $\tilde{G}$, unique up to isomorphism, such that the 
Lie algebra of $\tilde{G}$ is $\mathfrak{g}$ (and any Lie group isomorphic to $\tilde{G}$ has a Lie algebra isomorphic to $\mathfrak{g}$). Furthermore, if G 
is a connected Lie group with Lie algebra isomorphic to $\mathfrak{g}$, then $G \cong \tilde{G}/D$, where D is a discrete normal subgroup 
of the center of $\tilde{G}$. This is called Lie's third theorem (first proved by Cartan). See e.g. Duistermaat & Kolk (2000). 

Now we have the Lie algebras of O(1,2) (bis), P(2), and O(2,1), see (A.23), (A.34), and (A.28), 
respectively, and therewith, the homogeneous spaces 

$$O(1,2)/O(1,1) \cong dS^2; \qquad p = 1, \tag{A.101}$$
$$P(2)/O(1,1) \cong \mathbb{R}^2; \qquad p = 0, \tag{A.102}$$
$$O(2,1)/O(1,1) \cong AdS^2; \qquad p = -1, \tag{A.103}$$


see the right-hand sides of (A.45) - (A.47) in §A.3. 
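That each of the six bracket tables really defines a Lie algebra can be confirmed by checking the Jacobi identity on basis elements. The sketch below encodes the tables as we read them from (A.92) - (A.94) and (A.98) - (A.100): $[T_1,T_2] = pT_3$, $[T_3,T_1] = T_2$ and $[T_3,T_2] = sT_1$, with s = -1 for H = O(2) and s = +1 for H = O(1,1). The parameter name `s` and the function names are ours and hypothetical.

```python
from itertools import product

def make_bracket(p, s):
    """Bracket on span(T1, T2, T3) with [T1,T2] = p*T3, [T3,T1] = T2, [T3,T2] = s*T1."""
    table = {(0, 1): [0, 0, p], (2, 0): [0, 1, 0], (2, 1): [s, 0, 0]}

    def basis_bracket(i, j):
        if (i, j) in table:
            return table[(i, j)]
        if (j, i) in table:
            return [-c for c in table[(j, i)]]   # antisymmetry
        return [0, 0, 0]                         # i == j

    def bracket(x, y):
        out = [0, 0, 0]
        for i in range(3):
            for j in range(3):
                coeffs = basis_bracket(i, j)
                for k in range(3):
                    out[k] += x[i] * y[j] * coeffs[k]
        return out

    return bracket

def jacobi_holds(bracket):
    """Check [x,[y,z]] + [y,[z,x]] + [z,[x,y]] = 0 on all triples of basis vectors."""
    basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for x, y, z in product(basis, repeat=3):
        cyclic = [bracket(x, bracket(y, z)), bracket(y, bracket(z, x)), bracket(z, bracket(x, y))]
        if any(sum(term[k] for term in cyclic) != 0 for k in range(3)):
            return False
    return True

results = {(p, s): jacobi_holds(make_bracket(p, s)) for p in (1, 0, -1) for s in (-1, 1)}
print(results)
```

Since the bracket is bilinear, verifying the identity on basis elements suffices.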
We still need to compute the (constant) curvature of these spaces. 


Lemma A.14 For any symmetric space M = G/H, the Riemann tensor at $H \in G/H$ is given 
by 

$$\mathrm{Riem}_H(X, Y, X, Y) = -g_H([[X, Y], Y], X), \tag{A.104}$$

where $g_H$ is the metric at $H \in G/H$ (which determines the metric on G/H, cf. Proposition A.6). 


Proof. This formula follows from (4.12) and the Koszul formula (3.54) for the covariant 
derivative of the Levi-Civita connection.⁶⁶³ It is enough to verify (A.104) for orthonormal basis 
vectors $X = T_a$ and $Y = T_b$, which come from a basis of $\mathfrak{p}$, as explained in the main text for 
n = 2. In (3.54) only the last three (commutator) terms are nonzero, whilst in (4.12) only the 
term $-g(X, \nabla_{[X,Y]}Y)$ contributes; the others all involve commutators taking values in $\mathfrak{h}$, which 
give vectors that vanish at $H \in G/H$. This gives 

$$R_{abab} = -\tfrac{1}{2}\bigl(g_H([[T_a, T_b], T_b], T_a) + g_H([T_a, [T_a, T_b]], T_b)\bigr). \tag{A.105}$$

These two terms are equal because of $\mathrm{ad}(\mathfrak{h})$-invariance of $g_H$, which gives 

$$g_H([X, Y], Z) + g_H(Y, [X, Z]) = 0 \tag{A.106}$$

for any $Y, Z \in \mathfrak{p}$ and $X \in \mathfrak{h}$; use this with $X = [T_a, T_b]$, $Y = T_a$, and $Z = T_b$. This gives 

$$R_{abab} = -g_H([[T_a, T_b], T_b], T_a), \tag{A.107}$$

which is (A.104). 
By H-invariance, in an orthonormal basis $g_H$ must be the Euclidean or the Minkowski metric. 
In the former case, for n = 2, the orthonormal basis $(u_1, u_2)$ of $T_H(G/H) \cong \mathbb{R}^2$ may be either 
taken to be $(T_1, T_2)$ for any p, or, for p = 1 and hence O(3), the geometrically more natural basis 
discussed in §A.3, i.e., $(J_2, -J_1)$ (for the other two cases $(T_1, T_2)$ was also the natural basis). 
Either way, the Lie brackets (A.92) - (A.94) or (A.21), (A.32), and (A.26) give 

$$R_{1212} = p. \tag{A.108}$$

By (4.47) this also gives the sectional curvature, so that, so far in the Riemannian case, 

$$k = p. \tag{A.109}$$

Eq. (A.104) is also valid in the Lorentzian case, but here one must be more careful about the 
choice of the basis $(u_0, u_1)$ in 2d Minkowski space $(\mathbb{R}^2, \eta)$, with $\eta = \mathrm{diag}(-1, 1)$. 
We now study the three cases separately. 


663 See Kobayashi & Nomizu (1969), chapter XI, Theorems 3.2 and 3.3. 




• As explained in §A.3, for p = 1, i.e., G = O(1,2), we take $u_0 = -e_2$ and $u_1 = -e_1$. 
Eqs. (A.104) and (A.24) with $\eta(e_2, e_2) = \eta_{00} = -1$ then give $R_{0101} = -1$. Since the 
sectional curvature picks up a minus sign because of the denominator in (4.47), which in 
the Riemannian case equals +1 in an orthonormal basis but in the Lorentzian case equals 
−1, this gives k = 1. 

• Similarly, for p = −1 and hence G = O(2,1), with basis $u_0 = -f_2$ and $u_1 = -f_1$ of 
Minkowski $\mathbb{R}^2$, eqs. (A.104) and (A.28) give $R_{0101} = 1$ and hence k = −1. 

• Finally, for p = 0 we obtain $R_{0101} = 0$ because in (4.12) the commutator vanishes: 

$$[X, Y] = [T_1, T_2] = 0. \tag{A.110}$$

Hence we have k = p in all six cases. 
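The six computations can be summarized in a few lines of code. The sketch below evaluates (A.104) on the basis $(T_1, T_2)$ of $\mathfrak{p}$ and divides by $g(T_1,T_1)\,g(T_2,T_2)$, which is what we assume the denominator in (4.47) reduces to for orthogonal vectors. Two further assumptions are made explicit: the bracket tables are encoded as $[T_1,T_2] = pT_3$, $[T_3,T_1] = T_2$, $[T_3,T_2] = sT_1$ (s = -1 for O(2), s = +1 for O(1,1)), and in the Lorentzian case $T_2$ is taken to be the timelike vector, in line with the identification $u_0 = -e_2$ used above. All names are illustrative.

```python
def make_bracket(p, s):
    """Bracket on span(T1, T2, T3) with [T1,T2] = p*T3, [T3,T1] = T2, [T3,T2] = s*T1."""
    table = {(0, 1): [0, 0, p], (2, 0): [0, 1, 0], (2, 1): [s, 0, 0]}

    def basis_bracket(i, j):
        if (i, j) in table:
            return table[(i, j)]
        if (j, i) in table:
            return [-c for c in table[(j, i)]]   # antisymmetry
        return [0, 0, 0]                         # i == j

    def bracket(x, y):
        out = [0, 0, 0]
        for i in range(3):
            for j in range(3):
                coeffs = basis_bracket(i, j)
                for k in range(3):
                    out[k] += x[i] * y[j] * coeffs[k]
        return out

    return bracket

def curvature(p, s, g11, g22):
    """Return (Riem(T1,T2,T1,T2), sectional curvature) for the metric diag(g11, g22) on p."""
    bracket = make_bracket(p, s)
    T1, T2 = [1, 0, 0], [0, 1, 0]
    double = bracket(bracket(T1, T2), T2)        # [[T1,T2],T2], an element of p
    riem = -(g11 * double[0] * 1 + g22 * double[1] * 0)   # -g([[T1,T2],T2], T1)
    return riem, riem / (g11 * g22)

RIEMANNIAN = (-1, 1, 1)     # s, g(T1,T1), g(T2,T2)
LORENTZIAN = (1, 1, -1)     # T2 timelike (assumption, see lead-in)

for label, (s, g11, g22) in (("Riemannian", RIEMANNIAN), ("Lorentzian", LORENTZIAN)):
    for p in (1, 0, -1):
        riem, k = curvature(p, s, g11, g22)
        print(label, "p =", p, "Riem =", riem, "k =", k)
```

With the opposite choice of timelike vector in the Lorentzian case the overall sign of the metric, and hence of k, would flip, which is why the choice of basis matters there.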
This proves Corollary 4.11 for n = 2. As a bridge to the general case n > 2, we note that 

$$[u, v]w = p\,(\langle u, w \rangle v - \langle v, w \rangle u), \tag{A.111}$$

where the inner product is either the Euclidean or the Minkowski one, as the case requires. 
This follows from linear extension of (A.89) and hence has been derived for n = 2 only. But 
(A.111) holds in any dimension! To see this, we note that the adjoint action of H = O(n) or 
H = O(1,n−1) on $\mathfrak{g}$ consists of Lie algebra automorphisms (as this is true for all of G). Hence 

$$[ku, kv] = \mathrm{Ad}(k)([u, v]) = k[u, v]k^{-1}, \tag{A.112}$$

for any $k \in H$ and $u, v \in \mathfrak{p} \cong \mathbb{R}^n$, with $[u, v] \in \mathfrak{so}(n)$. If n > 2, we may take three mutually 
orthogonal vectors u, v, w and take k to be the reflection in the (hyper)plane orthogonal to w. 
Then by construction we have 

$$ku = u; \qquad kv = v; \qquad k^{-1}w = kw = -w, \tag{A.113}$$

so that (A.112) gives 

$$[u, v]w = -k([u, v]w). \tag{A.114}$$

By definition of k (which implies that kx = −x is only true if x is a multiple of w), this implies 
that [u,v]w is a multiple of w, which is impossible unless [u,v]w = 0 (since [u,v] is antisymmetric 
with respect to the inner product, $\langle w, [u,v]w \rangle = 0$, whereas $\langle w, w \rangle \neq 0$). Therefore, [u,v] maps any 
vector orthogonal to u and v to zero, which yields (A.111) for any n. The covariance property 
(A.112) has not only delivered the conclusion just given, but it also implies that the constant p in 
(A.111) is independent of the u-v plane (since H can move any plane to any other plane). 

Given (A.111), the Lie algebra $\mathfrak{g}$ is now entirely known.⁶⁶⁴ What remains is to find the right 
basis of $\mathfrak{g}$ for the three cases p = 1, 0, −1, and thus recover the Lie algebras of 


O(n+1); E(n); O(n, 1) (A.115) 
in the Euclidean case, and in the Minkowski case, of 


O(1,n); P(n); O(2,n—1). (A.116) 


Once again using (A.104) to show that k = p, this finishes the proof of Corollary 4.11. 


664 The next step is best done in a basis provided by the root space decomposition of semi-simple Lie algebras, 
which requires more background than this appendix offers. Helgason (1978) is a complete reference for this. 

This appendix collects some background for the study of the (gauged) Einstein equations as 
quasi-linear second-order hyperbolic PDEs. This field is huge and we just describe what we need 
for chapter 7. To start, all modern (i.e. post 1945) PDE theory is based on distributions. 


B.1 Distributions and Sobolev spaces on manifolds 


This section collects some basic facts, more or less in staccato style, and without proofs.⁶⁶⁵ 


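1. Notation. Let $n > 0$ and $x \in \mathbb{R}^n$. It is convenient to write $x = (x_1, \ldots, x_n)$ rather than our 
usual $(x^1, \ldots, x^n)$. Let $\alpha = (\alpha_1, \ldots, \alpha_n)$, with $\alpha_i \in \mathbb{N}$ (where $0 \in \mathbb{N}$). We abbreviate 

$|\alpha| = \sum_{i=1}^{n} \alpha_i$; (B.1) 
$D^\alpha = \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1} \cdots \left(\frac{\partial}{\partial x_n}\right)^{\alpha_n} = \partial_1^{\alpha_1} \cdots \partial_n^{\alpha_n} = \frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}$; (B.2) 
$x^\alpha = x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. (B.3) 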


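2. Test functions. For each measurable (usually open) subset $\Omega \subset \mathbb{R}^n$, let $\mathcal{D}(\Omega)$ be $C_c^\infty(\Omega)$ 
as a set, equipped with the topology in which $f_\lambda \to f$ iff there is a compact set $K \subset \Omega$ 
such that $\mathrm{supp}(f_\lambda) \subset K$ for all $\lambda$, and for all multi-indices $\alpha$ one has 

$\|D^\alpha (f_\lambda - f)\|_\infty \to 0$. (B.4) 

This implies $\mathrm{supp}(f) \subset K$ also for the limit function. This may be generalized to manifolds 
$M$, as follows. For some given atlas $(U_i, \varphi_i)$ we say that $f_\lambda \to f$ in $\mathcal{D}(M) = C_c^\infty(M)$ iff 
for each $\psi_i \in C_c^\infty(U_i)$ and all multi-indices $\alpha$ one has 

$\|D^\alpha ((\psi_i (f_\lambda - f)) \circ \varphi_i^{-1})\|_\infty \to 0$. (B.5) 

This turns out to be independent of the choice of the atlas. Elements of $\mathcal{D}(\mathbb{R}^n)$, $\mathcal{D}(\Omega)$, or 
$\mathcal{D}(M)$ are all called test functions. 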


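A rapidly decreasing (test) function $f \in \mathcal{S}(\mathbb{R}^n)$ is a function $f \in C^\infty(\mathbb{R}^n)$ for which the 
function $x \mapsto x^\alpha D^\beta f$ is bounded for all multi-indices $\alpha$ and $\beta$. One often writes 

$\langle x \rangle = (1 + |x|^2)^{1/2}$, (B.6) 

and uses $x \mapsto \langle x \rangle^{|\alpha|} D^\beta f$, which of course gives the same space. The topology on $\mathcal{S}(\mathbb{R}^n)$ is 
such that $f_\lambda \to f$ iff for all $l, m \in \mathbb{N}$ and multi-indices $\alpha$ and $\beta$ with $|\alpha| \le l$ and $|\beta| \le m$, 

$\|x^\alpha D^\beta (f_\lambda - f)\|_\infty \to 0$. (B.7) 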


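3. Distributions on $\Omega$ are elements of the space $\mathcal{D}'(\Omega)$ of all continuous linear maps $u : \mathcal{D}(\Omega) \to \mathbb{C}$. 

A linear map $u : \mathcal{D}(\Omega) \to \mathbb{C}$ is continuous in the topology just defined iff for each compact 
$K \subset \Omega$ there are $m \in \mathbb{N}$ and $C > 0$ such that, for all $f \in \mathcal{D}(\Omega)$ with $\mathrm{supp}(f) \subset K$, 

$|u(f)| \le C \sum_{|\alpha| \le m} \|D^\alpha f\|_\infty$. (B.8) 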


665 For details see for example Hörmander (1990), §6.3, Taylor (1996), §4.3, and Grubb (2009), passim. 

For example, a distribution of order zero (i.e. $m = 0$) is just a (Radon) measure on $\Omega$. 

The space $\mathcal{D}'(\Omega)$ carries the weak topology, in which $u_\lambda \to u$ iff $\langle u_\lambda, f \rangle \to \langle u, f \rangle$ for 
each $f \in \mathcal{D}(\Omega)$. In this topology, $\mathcal{D}(\Omega)$ is dense in $\mathcal{D}'(\Omega)$, where $u \in \mathcal{D}(\Omega)$ defines 
$u \in \mathcal{D}'(\Omega)$ through the $L^2$ inner product, i.e., $\langle u, f \rangle = \langle \overline{u}, f \rangle_{L^2(\Omega)}$. Adding a middle man 
gives a Gelfand triple, in which each embedding is continuous and dense: 

$\mathcal{D}(\Omega) \subset L^2(\Omega) \subset \mathcal{D}'(\Omega)$. (B.9) 


Likewise for $\mathcal{D}(M)$, provided we equip our manifold $M$ with a measure that in coordinates 
has the same null sets as Lebesgue measure.⁶⁶⁶ For example, any (background) Riemannian 
metric on $M$ provides such a measure (7.10). Also in that case we obtain a Gelfand triple 

$\mathcal{D}(M) \subset L^2(M) \subset \mathcal{D}'(M)$. (B.10) 


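Tempered distributions on $\mathbb{R}^n$ are continuous linear maps $u : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$. The ensuing space 
$\mathcal{S}'(\mathbb{R}^n)$ again carries the weak topology, and a linear map $u : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$ is continuous iff there are 
$l, m \in \mathbb{N}$ and $C > 0$ such that, for all $f \in \mathcal{S}(\mathbb{R}^n)$, 

$|u(f)| \le C \sum_{|\alpha| \le l, |\beta| \le m} \|x^\alpha D^\beta f\|_\infty$. (B.11) 

Similarly to (B.9), one has a Gelfand triple (i.e. the embeddings are continuous and dense) 

$\mathcal{S}(\mathbb{R}^n) \subset L^2(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n)$, (B.12) 

and since $\mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ continuously, and hence $\mathcal{S}'(\mathbb{R}^n) \subset \mathcal{D}'(\mathbb{R}^n)$, this extends to 

$\mathcal{D}(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n) \subset L^2(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n) \subset \mathcal{D}'(\mathbb{R}^n)$. (B.13) 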


4. Weak derivatives. From now on we write, whenever convenient, $\langle u, f \rangle$ 
for $u(f)$. For each multi-index $\alpha$, the weak derivative $D^\alpha u$ of $u \in \mathcal{D}'(\mathbb{R}^n)$ is defined by 

$\langle D^\alpha u, f \rangle = (-1)^{|\alpha|} \langle u, D^\alpha f \rangle$. (B.14) 

This definition comes from the fake formula $\langle u, f \rangle = \int_{\mathbb{R}^n} d^n x\, u(x) f(x)$, which on repeated 
partial integration would give (B.14). Any linear partial differential operator may therefore 
be regarded as a map $L : \mathcal{D}'(\mathbb{R}^n) \to \mathcal{D}'(\mathbb{R}^n)$, with adjoint $L^* : \mathcal{D}(\mathbb{R}^n) \to \mathcal{D}(\mathbb{R}^n)$, i.e., 

$\langle Lu, f \rangle = \langle u, L^* f \rangle$. (B.15) 

For example, if $L = D^\alpha$, then $L^* = (-1)^{|\alpha|} D^\alpha$. The derivatives in $Lu$ are called weak, 
those in $L^* f$ being classical. Similarly, a solution $u \in \mathcal{D}'(\mathbb{R}^n)$ of a linear PDE $Lu = F$ 
(with initial conditions), i.e. $\langle u, L^* f \rangle = \langle F, f \rangle$ for all $f \in \mathcal{D}(\mathbb{R}^n)$, is called weak. 

The definition (B.14) also applies to $u \in \mathcal{D}'(\Omega)$, at least if $\Omega$ is open in $\mathbb{R}^n$,⁶⁶⁷ as well as 
to $\mathcal{D}'(M)$, provided $M$ has no boundary (which indeed is our standing assumption). 


666 Hörmander’s definition of a distribution on $M$ coincides with the one above if we choose such a measure. 

667 Be careful with (B.15) if $\Omega$ is not open. For example, if $\Omega = [0,\infty) \times \mathbb{R}^n$ and $L = -\Box = \partial_t^2 - \Delta$, then (due to 
boundary terms in partial integration) the inhomogeneous wave equation $Lu = F$ with initial conditions $u(0,x) = f(x)$ 
and $\dot{u}(0,x) = g(x)$ becomes, for test functions $\varphi$, $-\int_0^\infty dt \int_{\mathbb{R}^n} d^n x\, u \Box \varphi = \int_0^\infty dt \int_{\mathbb{R}^n} d^n x\, F \varphi + \int_{\mathbb{R}^n} d^n x\, (g(x) \varphi(0,x) - f(x) \dot{\varphi}(0,x))$. 

5. Sobolev spaces. For any $s \in \mathbb{N}$, based on (B.9), define the Sobolev space 

$H^s(\Omega) := \{u \in L^2(\Omega) \mid D^\alpha u \in L^2(\Omega)\ \forall \alpha : |\alpha| \le s\}$, (B.16) 


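where accordingly the derivatives inherent in $D^\alpha$ are weak. Clearly, $H^0(\Omega) = L^2(\Omega)$, and 
it can be shown that all $H^s(\Omega)$ are Hilbert spaces with respect to the inner product 

$\langle u, v \rangle_s := \sum_{|\alpha| \le s} \langle D^\alpha u, D^\alpha v \rangle$, (B.17) 

where $\sum_{|\alpha| \le s}$ means $\sum_{\alpha : |\alpha| \le s}$, and $\langle \cdot, \cdot \rangle$ is the inner product in $L^2(\Omega)$. Note the danger 
of ambiguous notation here: $\langle \cdot, \cdot \rangle_p$ often denotes the inner product in $L^p$, but here $\langle \cdot, \cdot \rangle_s$ 
stands for the inner product in $H^s$; in our notation the inner product in $L^2$ would be $\langle \cdot, \cdot \rangle_0$. 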


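For $\Omega = \mathbb{R}^n$ a different perspective on Sobolev spaces comes from the Fourier transform 

$\hat{f}(\xi) := (2\pi)^{-n/2} \int_{\mathbb{R}^n} d^n x\, f(x) e^{-i \xi \cdot x}$; (B.18) 
$\check{f}(x) := (2\pi)^{-n/2} \int_{\mathbb{R}^n} d^n \xi\, f(\xi) e^{i \xi \cdot x}$, (B.19) 

which make sense as Lebesgue integrals for $f \in L^1(\mathbb{R}^n)$. If one also has $\hat{f} \in L^1(\mathbb{R}^n)$, then 

$\check{\hat{f}} = f$. (B.20) 

The scope of these formulae may be extended in at least three different ways:⁶⁶⁸ 

(a) Eq. (B.18) yields a unitary isomorphism $L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ of Hilbert spaces. 

(b) The Fourier transform also defines a linear homeomorphism $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. 

(c) Defining $\hat{u}$ for $u \in \mathcal{S}'(\mathbb{R}^n)$ by $\langle \hat{u}, f \rangle = \langle u, \hat{f} \rangle$, the Fourier transform (B.18) even 
defines a linear homeomorphism $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ of tempered distributions. 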


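Returning to Sobolev spaces, for $\Omega = \mathbb{R}^n$ we may now (re)define, for any $s \in \mathbb{R}$, 

$H^s(\mathbb{R}^n) := \{u \in \mathcal{S}'(\mathbb{R}^n) \mid \xi \mapsto \langle \xi \rangle^s \hat{u}(\xi) \in L^2(\mathbb{R}^n)\}$, (B.21) 

with inner product 

$\langle u, v \rangle_s := \int_{\mathbb{R}^n} d^n \xi\, \langle \xi \rangle^{2s}\, \overline{\hat{u}(\xi)}\, \hat{v}(\xi) = \int_{\mathbb{R}^n} d^n \xi\, (1 + |\xi|^2)^s\, \overline{\hat{u}(\xi)}\, \hat{v}(\xi)$. (B.22) 

For $s \in \mathbb{N}$ this reproduces (B.16) as a vector space (a fact that is not obvious), but the 
inner products (B.17) and (B.22) are different. Although they induce equivalent norms, 
for $s \in \mathbb{N}$ one has to specify which one is used. Either way, we have: 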


668 If one equips $C_c^\infty(\mathbb{R}^n)$ with the unusual norm $\|f\|_0 = \max\{\|f\|_\infty, \|\hat{f}\|_\infty\}$, with associated completion denoted by 
$\tilde{C}_0(\mathbb{R}^n)$, then (B.18) yields an isometric isomorphism $\tilde{C}_0(\mathbb{R}^n) \cong \tilde{C}_0(\mathbb{R}^n)$ as Banach spaces. For C*-algebra experts 
we note that the Fourier transform also yields an isomorphism $C^*(\mathbb{R}^n) \cong C_0(\mathbb{R}^n)$ of commutative C*-algebras 
(here $C^*(\mathbb{R}^n)$ is the completion of $C_c^\infty(\mathbb{R}^n)$ in the operator norm obtained by letting $f \in C_c^\infty(\mathbb{R}^n)$ act on $L^2(\mathbb{R}^n)$ by 
convolution, whereas $C_0(\mathbb{R}^n)$ carries the supremum-norm). In this case (which follows from the Riemann-Lebesgue 
lemma) the Fourier transform is a special case of the Gelfand transform. See Landsman (2017), §C.15. 

Theorem B.1 1. Sobolev embedding theorem: For $m \ge 0$ and $s > m + \frac{1}{2} n$, one has 


$H^s(\mathbb{R}^n) \subset C^m(\mathbb{R}^n)$, (B.23) 


where the embedding is continuous with respect to the norm $\|u\|_m = \sum_{|\alpha| \le m} \|D^\alpha u\|_\infty$. 


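2. Sobolev duality theorem: For any $s \in \mathbb{R}$ one has 

$H^s(\mathbb{R}^n)^* \cong H^{-s}(\mathbb{R}^n)$, (B.24) 

i.e. $\lambda \in H^s(\mathbb{R}^n)^*$ linearly, bijectively, and isometrically corresponds to $f \in H^{-s}(\mathbb{R}^n)$ via 

$\lambda(u) = \int_{\mathbb{R}^n} d^n x\, \overline{f(x)}\, u(x) = \langle f, u \rangle$. (B.25) 

3. For $s > 0$ we have our third Gelfand triple 

$H^s(\mathbb{R}^n) \subset L^2(\mathbb{R}^n) \subset H^{-s}(\mathbb{R}^n)$, (B.26) 

which analogously to (B.13) may be extended to a “Gelfand quintuple” 

$\mathcal{S}(\mathbb{R}^n) \subset H^s(\mathbb{R}^n) \subset L^2(\mathbb{R}^n) \subset H^{-s}(\mathbb{R}^n) \subset \mathcal{S}'(\mathbb{R}^n)$. (B.27) 

6. Sobolev spaces can also be defined on manifolds. For $u \in \mathcal{D}'(M)$, we define $u \in H^s(M)$ iff 
for each chart $(U_i, \varphi_i)$ and $\chi_i \in C_c^\infty(V_i)$, where $V_i = \varphi_i(U_i) \subset \mathbb{R}^n$, the distribution $u \circ \varphi_i^{-1} \chi_i$ 
on $\mathcal{D}(\mathbb{R}^n)$, defined on $f \in \mathcal{D}(\mathbb{R}^n)$ by $\langle u \circ \varphi_i^{-1} \chi_i, f \rangle = \langle u, (\chi_i f) \circ \varphi_i \rangle$, is in $H^s(\mathbb{R}^n)$. 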


Theorem B.2 Let $M$ be a compact Riemannian manifold. 

1. For each $s \in \mathbb{R}$ the space $\mathcal{D}(M)$ is dense in $H^s(M)$. 

2. For each $s \in \mathbb{R}$ we have an isometric (Banach space) isomorphism 

$H^s(M)^* \cong H^{-s}(M)$, (B.28) 

understood in the following way:⁶⁶⁹ any continuous functional $\lambda \in H^s(M)^*$ corresponds 
linearly, bijectively, and isometrically to $f \in H^{-s}(M)$ via 

$\lambda(u) = \langle f, u \rangle_{L^2(M)}$. (B.29) 

3. Sobolev embedding theorem: If $s > \frac{1}{2} n + k$, then $H^s(M) \subset C^k(M)$, where the embedding 
is continuous with respect to the norm $\|u\|_k = \sum_{|\alpha| \le k} \|D^\alpha u\|_\infty$ on $C^k(M)$. 

4. Rellich theorem: For $s \in \mathbb{R}$ and $\delta > 0$, the injection $H^{s+\delta}(M) \to H^s(M)$ is compact. 


5. For $s > 0$ we have our final Gelfand triple, cf. (B.26), 

$H^s(M) \subset L^2(M) \subset H^{-s}(M)$. (B.30) 

B.2 Linear wave equations 


For the PDEs of interest to GR, $\mathbb{R}^n$ will be space, and time needs to be treated separately. Typically, 
for fixed time $T > 0$ one considers Banach spaces like $C([0,T], H^s(\mathbb{R}^n))$, with norm 

$\|u\|_\infty = \sup_{t \in [0,T]} \|u(t)\|_s$, (B.31) 

or $C^1([0,T], H^s(\mathbb{R}^n))$ with analogous norm, or $L^p([0,T], H^s(\mathbb{R}^n))$, $1 \le p < \infty$, normed by 

$\|u\|_p = \left( \int_0^T dt\, \|u(t)\|_s^p \right)^{1/p}$, (B.32) 

or $L^\infty([0,T], H^s(\mathbb{R}^n))$, with norm 

$\|u\|_\infty = \mathrm{ess\,sup}_{t \in [0,T]} \|u(t)\|_s$. (B.33) 

Here we define $L^p([0,T], H^s(\mathbb{R}^n))$, $1 \le p < \infty$, as the completion of $C([0,T], H^s(\mathbb{R}^n))$ in 
the norm (B.32), and also (avoiding Banach space-valued measurable functions) define the 
space $L^\infty([0,T], H^s(\mathbb{R}^n))$ as the (Banach) dual of $L^1([0,T], H^{-s}(\mathbb{R}^n))$, in that we identify 
$f \in L^\infty([0,T], H^s(\mathbb{R}^n))$ with the functional $\lambda_f \in (L^1([0,T], H^{-s}(\mathbb{R}^n)))^*$ given by, cf. (B.29), 

$\lambda_f(g) = \int_0^T dt\, \langle f(t), g(t) \rangle$. (B.34) 


To see such spaces in action, we consider the free wave equation on $\mathbb{R}^{n+1}$, i.e. 

$(-\partial_t^2 + \Delta) u = F$; $u(0,x) = f(x)$; $\dot{u}(0,x) = g(x)$. (B.35) 

For $F = 0$ and $n = 1, 3$, the (unique) solution (known since the 18th century) is 

$u(t,x) = \frac{1}{2} \left( f(x+t) + f(x-t) + \int_{x-t}^{x+t} dy\, g(y) \right)$ $(n = 1)$; (B.36) 
$u(t,x) = \frac{1}{4\pi t^2} \int_{|y-x|=t} d^2\sigma(y) \left( t g(y) + f(y) - \sum_{i=1}^{3} \partial_i f(y) (x_i - y_i) \right)$ $(n = 3)$. (B.37) 

From this, we see that in $n = 1$ the solution at $(t,x)$ only depends on initial data within its causal 
past $J^-(t,x)$, intersected with the Cauchy surface $\Sigma = \{(x^0 = 0, x), x \in \mathbb{R}^n\}$. Indeed, recall the 
causal past $J^-(t,x)$, emanating from $(t,x)$, and its boundary $E^-(t,x)$, i.e. the past lightcone, 

$J^-(t,x) = \{(y^0, y) \in \mathbb{R}^{n+1} \mid |y^0 - t| \ge |y - x|, y^0 \le t\}$; (B.38) 
$E^-(t,x) = \{(y^0, y) \in \mathbb{R}^{n+1} \mid |y^0 - t| = |y - x|, y^0 \le t\}$, (B.39) 

cf. (5.90) - (5.91) with $y^0 \ge x^0$ replaced by $y^0 \le x^0$ (as well as $x$ by $(t,x)$, etc.). In $n = 1$, 

$\Sigma \cap J^-(t,x) = \{(y^0 = 0, y), y \in [x - t, x + t]\}$. (B.40) 

In $n = 3$ the solution $u(t,x)$ even depends on the initial data at $\Sigma \cap E^-(t,x)$ only, since 

$\Sigma \cap E^-(t,x) = \{(y^0 = 0, y), |y - x| = t\}$. (B.41) 
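The domain-of-dependence statement for $n = 1$ can be illustrated numerically. The sketch below evaluates the d'Alembert formula (B.36) with a midpoint rule; the initial data `f`, `g` and the perturbation `bump` are hypothetical choices made only for this illustration. Changing the data outside $[x - t, x + t]$ leaves $u(t,x)$ unchanged.

```python
import math

def dalembert(f, g, t, x, steps=2000):
    """Evaluate (B.36): u = (f(x+t) + f(x-t) + integral of g over [x-t, x+t]) / 2."""
    h = 2.0 * t / steps
    integral = sum(g(x - t + (i + 0.5) * h) for i in range(steps)) * h  # midpoint rule
    return 0.5 * (f(x + t) + f(x - t) + integral)

def f(y):
    return math.exp(-y * y)   # hypothetical initial position

def g(y):
    return math.sin(y)        # hypothetical initial velocity

def bump(y):
    # Perturbation supported in [3, 5], outside [x - t, x + t] = [-1, 1] used below.
    return (y - 3.0) ** 2 * (5.0 - y) ** 2 if 3.0 <= y <= 5.0 else 0.0

t, x = 1.0, 0.0
u_original = dalembert(f, g, t, x)
u_perturbed = dalembert(lambda y: f(y) + bump(y), lambda y: g(y) + bump(y), t, x)
```

Here `u_original` and `u_perturbed` coincide because every point at which the data are sampled lies in the interval (B.40).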


669 Also, $H^s(M)^* \cong H^s(M)$ through its own inner product; the pairing in (B.25) is through the $L^2$ inner product. 

An analogous phenomenon holds in the inhomogeneous case $F \ne 0$, in which case the solution 

$u(t,x) = -\frac{1}{4\pi} \int_{|y-x| \le t} d^3 y\, \frac{F(s,y)}{|x-y|}$, with $s = t - |x-y|$, (B.42) 

for zero initial data for simplicity, only depends on the values of $F$ at the past lightcone $E^-(t,x)$. 
In other words, $F(s,y)$ only influences $u$ along the forward lightcone emanating from $(s,y)$. The 
situation in $n = 3$ (and also in all higher odd spatial dimensions), in which both initial data $f, g$ 
and the inhomogeneous term $F$ affect the solution only along future light rays, is called the 
strong Huygens principle. The (ordinary) Huygens principle, then, formalizes the situation in 
$n = 1, 2$, and all higher even dimensions, in which the entire causal future of $(s,y)$ affects the 
solution; equivalently, $u(t,x)$ only depends on data within its causal past. 
An explicit solution for any $F$, $f$, and $g$ may be written down using the Fourier transform: 

$\hat{u}(t,\xi) = \cos(t|\xi|) \hat{f}(\xi) + \frac{\sin(t|\xi|)}{|\xi|} \hat{g}(\xi) - \int_0^t ds\, \frac{\sin((t-s)|\xi|)}{|\xi|} \hat{F}(s,\xi)$; (B.43) 

as the notation indicates, the formula (B.18) is only applied to the x-variable, and, within the 
function classes to be discussed, the actual solution $u(t,x)$ may be (re)constructed from (B.19). 
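As a sanity check on the Fourier-space formula, one can fix a single frequency $k = |\xi| > 0$ and verify numerically that the right-hand side of (B.43) solves the ODE obtained from (B.35), namely $\ddot{\hat{u}} + k^2 \hat{u} + \hat{F} = 0$. The following is a minimal sketch; the forcing term, the initial values `a`, `b`, and the step sizes are hypothetical choices for illustration only.

```python
import math

def u_hat(t, k, a, b, forcing, steps=4000):
    """Right-hand side of (B.43) at one frequency |xi| = k > 0, with f_hat = a and g_hat = b."""
    h = t / steps
    duhamel = sum(
        math.sin((t - (i + 0.5) * h) * k) / k * forcing((i + 0.5) * h)
        for i in range(steps)
    ) * h  # midpoint rule for the s-integral
    return math.cos(t * k) * a + math.sin(t * k) / k * b - duhamel

def residual(t, k, a, b, forcing, dt=1e-2):
    """Central finite-difference value of u'' + k^2 u + F, which should be close to zero."""
    u_minus = u_hat(t - dt, k, a, b, forcing)
    u_zero = u_hat(t, k, a, b, forcing)
    u_plus = u_hat(t + dt, k, a, b, forcing)
    second_derivative = (u_plus - 2.0 * u_zero + u_minus) / dt**2
    return second_derivative + k * k * u_zero + forcing(t)

def forcing(s):
    return math.exp(-s)  # hypothetical Fourier-transformed source at this frequency

res = residual(0.7, 2.0, 1.0, -0.5, forcing)
initial_value = u_hat(0.0, 2.0, 1.0, -0.5, forcing)
```

The residual is small (limited by the finite-difference step), and `initial_value` reproduces the prescribed initial value `a`.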
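Although the space-time and causal structure of the solution is not at all obvious from this 
formula, the advantage is that (B.43) easily implies an energy inequality: for any $s \in \mathbb{Z}$, 

$\|u(t)\|_{s+1} + \|\dot{u}(t)\|_s \le C_T \left( \|f\|_{s+1} + \|g\|_s + \int_0^T d\tau\, \|F(\tau)\|_s \right)$, (B.44) 

where $0 \le t \le T < \infty$ and $C_T$ is a constant depending on $T$, provided that $F \in L^1([0,T], H^s(\mathbb{R}^n))$, $f \in H^{s+1}(\mathbb{R}^n)$, and $g \in H^s(\mathbb{R}^n)$, so 
that the right-hand side makes sense. The proof is an exercise, using the fact that (B.22) implies 

$\|u\|_s^2 = \int_{\mathbb{R}^n} d^n \xi\, \langle \xi \rangle^{2s} |\hat{u}(\xi)|^2$. (B.45) 

Corollary B.3 For any $T > 0$ and $s \in \mathbb{Z}$, the free wave equation (B.35) with initial conditions 
$f \in H^{s+1}(\mathbb{R}^n)$ and $g \in H^s(\mathbb{R}^n)$, and $F \in L^1([0,T], H^s(\mathbb{R}^n))$, has a unique solution 

$u \in C([0,T], H^{s+1}(\mathbb{R}^n)) \cap C^1([0,T], H^s(\mathbb{R}^n))$. (B.46) 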


Uniqueness follows either from the derivation of the explicit solution (B.43) from the 
initial data, or from (B.44): if $u_1$ and $u_2$ both solve (B.35), then $u = u_1 - u_2$ solves (B.35) for 
$F = f = g = 0$, so that the right-hand side and hence the left-hand side of (B.44) vanish, etc. 

We now turn to linear wave equations of the form $Lu = F$ with initial data (B.35), and 

$L = g^{\rho\sigma}(t,x) \partial_\rho \partial_\sigma + b^\rho(t,x) \partial_\rho + a(t,x)$. (B.47) 


Since we don’t have an explicit solution, the derivation of a suitable energy inequality (to be 
used as a lemma for proving existence, uniqueness, and analytic properties of solutions) will 
have to be a priori.⁶⁷⁰ A particularly useful energy inequality for the operator (B.47) is 

$\sum_{|\alpha| \le 1} \|D^\alpha u(t,\cdot)\|_s \le C \left( \sum_{|\alpha| \le 1} \|D^\alpha u(0,\cdot)\|_s + \int_0^t d\tau\, \|Lu(\tau,\cdot)\|_s \right)$. (B.48) 


670 These a priori derivations are straightforward but very lengthy, and therefore we simply state the results without 
derivation; for (B.48) see Sogge (2008), §I.3 and Luk (undated), §4. See also Ringström (2009) for similar estimates. 

This inequality is valid for any $0 \le t \le T < \infty$, $s \in \mathbb{Z}$, and $u$ such that (B.46) holds,⁶⁷¹ with 
$Lu \in L^1([0,T], H^s)$. It immediately gives uniqueness by the same argument as for the free wave 
equation, but existence and regularity require a more advanced, functional-analytic argument.⁶⁷² 
In order to explain the reasoning, let us first take a simpler situation. For $\Omega \subset \mathbb{R}^n$, let 

$L : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ (B.49) 

be a linear operator, e.g. as in (B.47), with adjoint $L^* : \mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$ defined by (B.15). As 
already mentioned, the PDE $Lu = F$ (with zero initial conditions for simplicity) then means 

$\langle u, L^* f \rangle = \langle F, f \rangle$ (B.50) 

for all $f \in \mathcal{D}(\Omega)$. Throughout the argument, we must assume that, for any net $(f_\lambda)$ in $\mathcal{D}(\Omega)$, 

$L^* f_\lambda \to L^* f \Rightarrow f_\lambda \to f$. (B.51) 


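If $L^*$ is a bijection, and $F \in \mathcal{D}'(\Omega)$, which is the very least regularity to impose, then we are 
done at the coarsest level of proving existence and uniqueness of a solution $u \in \mathcal{D}'(\Omega)$, since its 
value at $\psi \in \mathcal{D}(\Omega)$ is given by finding the unique $f \in \mathcal{D}(\Omega)$ for which $\psi = L^* f$, and putting 

$\langle u, \psi \rangle = \langle F, f \rangle$. (B.52) 

The assumption (B.51) then implies that if $\psi_\lambda \to \psi$, i.e., $L^* f_\lambda \to L^* f$, then $f_\lambda \to f$, and 
hence $\langle F, f_\lambda \rangle \to \langle F, f \rangle$ since $F \in \mathcal{D}'(\Omega)$ by assumption, and hence $\langle u, \psi_\lambda \rangle \to \langle u, \psi \rangle$, since 
$\langle u, \psi_\lambda \rangle = \langle F, f_\lambda \rangle$. Thus $u$ is a continuous linear functional on $\mathcal{D}(\Omega)$ and hence $u \in \mathcal{D}'(\Omega)$. 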
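The construction of $u$ through the adjoint, as in (B.52), has a finite-dimensional analogue that may help fix ideas. In the sketch below, a 2×2 matrix is a toy stand-in for $L$, its transpose plays the role of $L^*$, and the pairing is the dot product; these are assumptions made purely for illustration, and none of the analytic subtleties (topologies, density, extension) appear in this setting.

```python
from fractions import Fraction

def transpose(m):
    return [list(row) for row in zip(*m)]

def apply(m, v):
    """Matrix m applied to vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def pairing(u, f):
    """Finite-dimensional stand-in for the pairing <u, f>."""
    return sum(a * b for a, b in zip(u, f))

def solve_2x2(m, rhs):
    """Solve m f = rhs by Cramer's rule (m must be invertible)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [(rhs[0] * d - b * rhs[1]) / det, (a * rhs[1] - c * rhs[0]) / det]

L = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]  # toy operator L
F = [Fraction(5), Fraction(-1)]                              # toy source F
L_star = transpose(L)                                        # adjoint, cf. (B.15)

def u_functional(psi):
    """Value <u, psi> as in (B.52): find the unique f with psi = L* f, return <F, f>."""
    f = solve_2x2(L_star, psi)
    return pairing(F, f)

# Read off the components of u by testing against the standard basis.
basis = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
u = [u_functional(e) for e in basis]
```

In this toy case the functional defined through $L^*$ is represented by a vector `u` that indeed satisfies `apply(L, u) == F`.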

If $L^*$, still assumed to be injective, merely has dense range $\mathrm{ran}(L^*) \subset \mathcal{D}(\Omega)$, then one still 
has existence and uniqueness of $u$, since for $\psi \in \mathrm{ran}(L^*)$ eq. (B.52) continues to apply, whereas 
for $\psi$ outside the range of $L^*$ we may write $\psi = \lim_\lambda L^* f_\lambda$ and then $\langle u, \psi \rangle = \lim_\lambda \langle F, f_\lambda \rangle$. 

Finally, if $L^*$, still injective, does not have dense range, the Hahn-Banach theorem (for locally 
convex vector spaces) yields existence of $u$ by extending the solution $u : \mathrm{ran}(L^*) \to \mathbb{C}$ constructed 
above to a continuous linear map $u : \mathcal{D}(\Omega) \to \mathbb{C}$, but one loses uniqueness. Fortunately, in many 
applications to PDEs uniqueness still follows from suitable energy inequalities. 

Such inequalities also play a central role in refining the above argument. Suppose one has 
two Gelfand(ish) triples $\mathcal{D}(\Omega) \subset W \subset \mathcal{D}'(\Omega)$ and $\mathcal{D}(\Omega) \subset Z \subset \mathcal{D}'(\Omega)$, where $W$ and $Z$ are 
Banach spaces and all inclusion maps are continuous with dense image, and suppose that 

$\|f\|_Z \le C \|L^* f\|_W$ $(\forall f \in \mathcal{D}(\Omega))$. (B.53) 

This ‘energy condition’ supersedes the continuity assumption (B.51) within $\mathcal{D}(\Omega)$, and is also 
more powerful in that it clearly implies that $L^*$ is injective, which is an essential condition for the 
whole analysis to apply in the first place. Furthermore, the inequality (B.53) implies: 

Provided $L^*$ is injective, for any $F \in Z^*$ there is a solution $u \in W^*$ to $Lu = F$. 

Note that $\mathcal{D}(\Omega) \subset Z$ implies $Z^* \subset \mathcal{D}'(\Omega)$, and similarly $\mathcal{D}(\Omega) \subset W$ implies $W^* \subset \mathcal{D}'(\Omega)$. 
Compared with the earlier argument where the assumption $F \in \mathcal{D}'(\Omega)$ gave a solution $u \in \mathcal{D}'(\Omega)$, 

671 Moreover, the derivation requires that $g^{\mu\nu}(t,x)$, $b^\mu(t,x)$, and $a(t,x)$ be $C^\infty$ with uniform bounds on all 
derivatives, where $(t,x) \in [0,T] \times \mathbb{R}^n$, as well as $\sum_{\mu,\nu} |g^{\mu\nu}(t,x) - \eta^{\mu\nu}| < \frac{1}{2}$, where $\eta$ is the Minkowski metric. 

672 The following arguments are adapted from Vasy (2015), chapter 17. The entire book is very useful. 

we have now strengthened the assumption to $F \in Z^* \subset \mathcal{D}'(\Omega)$, and, given (B.53), accordingly 
strengthened the conclusion $u \in \mathcal{D}'(\Omega)$ to $u \in W^* \subset \mathcal{D}'(\Omega)$. Indeed, noting that 

$\mathrm{ran}(L^*) \subset \mathcal{D}(\Omega) \subset W$, (B.54) 

let $\psi \in \mathrm{ran}(L^*)$, so $\psi = L^* f$, and define a linear map $u : W \to \mathbb{C}$ initially on $\mathrm{ran}(L^*) \subset W$ by 

$\langle u, L^* f \rangle_{W^* - W} = \langle F, f \rangle_{Z^* - Z}$. (B.55) 

Because of (B.53), if $L^* f_\lambda \to L^* f$ in $W$, then $f_\lambda \to f$ in $Z$, and hence on the assumption $F \in Z^*$, 
the functional $u$ defined by (B.55) is continuous on $\mathrm{ran}(L^*)$ in the (norm) topology of $W$. Once 
again, the Hahn-Banach extension theorem (but this time simply for Banach spaces) gives a 
continuous extension $u : W \to \mathbb{C}$, i.e. $u \in W^*$, as claimed. 

We now show how the energy estimate (B.48) implies an estimate à la (B.53). For any $T > 0$, 
we replace $u$ in (B.48) by $f \in C_c^\infty((0,T) \times \mathbb{R}^n)$, which certainly satisfies the assumptions validating 
(B.48), and replace $L$ by $L^*$. Then $D^\alpha u(0,\cdot)$ is replaced by $D^\alpha f(0,\cdot) = 0$. Furthermore, 
for any $s \in \mathbb{R}$, $k \in \mathbb{N}$, and $f \in H^{-s}$, by definition of the Sobolev spaces we have, for some constant $C > 0$, 

$\|f\|_{-s} \le C \sum_{|\alpha| \le k} \|D^\alpha f\|_{-s-k}$. (B.56) 

With $k = 1$, also using the trivial estimate $\int_0^t d\tau\, g(\tau) \le \int_0^T d\tau\, g(\tau)$ for $0 \le t \le T$ and $g(\tau) \ge 0$, 
in this case with $g(\tau) = \|L^* f(\tau,\cdot)\|_{-s-1}$, we find, for any $s \in \mathbb{Z}$ and $f \in C_c^\infty((0,T) \times \mathbb{R}^n)$, 

$\sup_{t \in [0,T]} \|f(t,\cdot)\|_{-s} \le C \int_0^T d\tau\, \|L^* f(\tau,\cdot)\|_{-s-1}$. (B.57) 

This is a special case of (B.53), with 

$W = L^1([0,T], H^{-s-1}(\mathbb{R}^n))$; $Z = C([0,T], H^{-s}(\mathbb{R}^n))$; (B.58) 
$W^* = L^\infty([0,T], H^{s+1}(\mathbb{R}^n))$; $Z^* \supset L^1([0,T], H^s(\mathbb{R}^n))$. (B.59) 

The precise form of $Z^*$ (which is the space of bounded measures on $[0,T]$ taking values in $H^s$) is 
not needed here. Assuming zero initial conditions for the moment, the abstract argument above 
gives a solution $u \in L^\infty([0,T], H^{s+1}(\mathbb{R}^n))$ for $F \in L^1([0,T], H^s(\mathbb{R}^n))$, which, by the original 
energy inequality (B.48), is also unique. More advanced arguments involving elliptic regularity 
further push the solution into (B.46).⁶⁷³ Finally, the case of nonzero initial data $f, g$ can be 
reduced to the case $f = g = 0$ by a standard trick. Define $w(t,x) = f(x) + t g(x)$, and for given $F$ let $v$ solve 
$Lv = F - Lw$ for zero initial data. Then $u = v + w$ solves $Lu = F$ for given $f, g$. Thus: 


Theorem B.4 For any $T > 0$, let $L$ be defined by (B.47), including all assumptions stated 
afterwards. For any $s \in \mathbb{Z}$, the linear wave equation $Lu = F$, with $F \in L^1([0,T], H^s(\mathbb{R}^n))$ and 
initial conditions $f \in H^{s+1}(\mathbb{R}^n)$ and $g \in H^s(\mathbb{R}^n)$, see (B.35), has a unique solution 

$u \in C([0,T], H^{s+1}(\mathbb{R}^n)) \cap C^1([0,T], H^s(\mathbb{R}^n))$. (B.60) 
The Sobolev embedding theorem (B.23) then pushes this into the smooth realm: 
Corollary B.5 In the setting of the previous theorem, if F, f, and g are smooth, then so is u. 
One can also show that the causal properties of the solution relative to F and the initial data f,g 


are the same as for the free wave equation, except that the strong Huygens principle need not 
apply. But the ‘ordinary’ one, implying causal propagation of initial data and F, always does. 


673 See Sogge (2008), p. 20. 

B.3 Quasi-linear wave equations 


In either the (naive) wave gauge or its refinement the g-wave gauge, the vacuum Einstein 
equations (7.121) have the abstract form $Lu = F$, where $u = g_{\mu\nu}$ and $L$ is like (B.47), with the 
difference that in $L = g^{\rho\sigma}(u) \partial_\rho \partial_\sigma$ the coefficient of the highest (i.e. second) order derivative 
now depends on $u$, and furthermore $F = F(u, \partial u)$ depends on $u$ and $\partial u$. Such equations (in the 
more general case that $g$ and $F$ may depend on $u$, $\partial u$, and even $(t,x)$) are called quasi-linear, 
and if the signature of $g$ is Lorentzian, as we of course assume, the PDE is hyperbolic.⁶⁷⁴ 

We assume for the moment that $u$ takes values in $\mathbb{R}$; the generalization to $u = (g_{\mu\nu})$ taking 
values in $\mathbb{R}^{10}$ is straightforward and will be outlined shortly. It is also sufficient for basic 
applications to GR to assume that $g^{\rho\sigma} : \mathbb{R} \to \mathbb{R}$ is smooth, as is $F : \mathbb{R} \times \mathbb{R}^{n+1} \to \mathbb{R}$. So we study 


$g^{\rho\sigma}(u) \partial_\rho \partial_\sigma u = F(u, \partial u)$. (B.61) 


As opposed to truly nonlinear hyperbolic PDEs, the quasi-linear case is relatively easy because 
it can be solved by reduction to the linear case. One can only feel fortunate that the Einstein 
equations (at least in a suitable gauge) fall into this category. Here is the basic result:⁶⁷⁵ 


Theorem B.6 Let $F$ be smooth and $g^{\rho\sigma}$ smooth and not too far from the Minkowski metric.⁶⁷⁶ 

1. With $f := u(0,\cdot) \in H^{s+1}(\mathbb{R}^n)$ and $g := \dot{u}(0,\cdot) \in H^s(\mathbb{R}^n)$, eq. (B.61) has a unique solution 

$u \in L^\infty([0,T], H^{s+1}(\mathbb{R}^n))$; $\dot{u} \in L^\infty([0,T], H^s(\mathbb{R}^n))$, (B.62) 

provided $s > \frac{1}{2} n$. Here $T$ is either arbitrary (as in the linear case), or there exists 

$T_* = T_*(\|f\|_{s+1}, \|g\|_s)$ (B.63) 

such that $\|D^\alpha u\|_\infty = \infty$ on $[0, T_*] \times \mathbb{R}^n$, for some $|\alpha| \le 2$.⁶⁷⁷ 

2. This $u$ depends continuously on the initial data, i.e. if $f_\lambda \to f$ in $H^{s+1}(\mathbb{R}^n)$ and $g_\lambda \to g$ in 
$H^s(\mathbb{R}^n)$, then $u_\lambda \to u$ in $L^\infty([0,T], H^{s+1}(\mathbb{R}^n))$ with $\dot{u}_\lambda \to \dot{u}$ in $L^\infty([0,T], H^s(\mathbb{R}^n))$. 

3. If $f \in C_c^\infty(\mathbb{R}^n)$ and $g \in C_c^\infty(\mathbb{R}^n)$, then $u \in C^\infty([0,T] \times \mathbb{R}^n)$, cf. Corollary B.5. 

Eq. (B.61) is solved using a generalization of the Picard iteration procedure.⁶⁷⁸ Take 

$u_0(x) = f(x) = u(0,x)$, (B.64) 

and iteratively define $u_{k+1}$ as the solution to the inhomogeneous linear PDE 

$g^{\rho\sigma}(u_k) \partial_\rho \partial_\sigma u_{k+1} = F(u_k, \partial u_k)$, (B.65) 


674 In fluid mechanics all these dependencies also occur, see e.g. Taylor (1996), chapter 16. 

675 See Sogge (2008), §I.4, Luk (undated), §6, Choquet-Bruhat (2009), App. III, or Ringström (2009), chapter 9. 

676 Think of $\sum_{\rho,\sigma} \|g^{\rho\sigma} - \eta^{\rho\sigma}\|_\infty < \frac{1}{2}$, as in Sogge (2008). For initial data $f \in H^{s+1}(\mathbb{R}^n)$ and $g \in H^s(\mathbb{R}^n)$, one 
can make further (contrived) regularity assumptions on $g^{\rho\sigma}$ and $F$ that push $u$ into (B.60). See Ringström (2009), chapter 9. 

677 For an example with $T_* < \infty$, take $(\partial_t^2 - \Delta) u = \dot{u}^2$ with $u(0,x) = \dot{u}(0,x) = 1$ (times a cutoff function), so that 
$\dot{u}(t,x) = 1/(1-t)$ (for small $x$), and hence $T_* = 1$. 

678 Recall that an ODE $u'(t) = f(t, u(t))$ with initial condition $u(0) = u_0$, which is equivalent to the integral equation 
$u(t) = u_0 + \int_0^t ds\, f(s, u(s))$, may be solved by iteration from $u_0(t) = u_0$ and $u_{k+1}(t) = u_0 + \int_0^t ds\, f(s, u_k(s))$. 
For suitably regular $f$, this sequence $(u_k)$ uniformly converges to a solution $u$ on some interval $[0,T]$. 

subject to the initial conditions $u_{k+1}(0,x) = f(x)$ and $\dot{u}_{k+1}(0,x) = g(x)$, as for $u$ itself.⁶⁷⁹ For 
given $u_k(t,x)$, eq. (B.65) is the type of PDE studied in the previous section. Hence Theorem B.4 
guarantees a solution for any $T > 0$, but the need for convergence of the iteration and for uniformity of the energy 
inequality (B.44) in $k$ gives a less regular solution in Theorem B.6 compared to the linear case. 
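The ODE version of Picard iteration recalled in the footnote on the iteration procedure is easy to run by hand. The sketch below treats only the simplest hypothetical example, $u' = u$ with $u(0) = 1$, storing each iterate as a list of exact polynomial coefficients; it is meant as an illustration of the iteration scheme, not of the PDE iteration (B.65) itself.

```python
from fractions import Fraction

def picard_step(coeffs, u0):
    """One step u_{k+1}(t) = u0 + integral_0^t u_k(s) ds for the ODE u' = u.

    Polynomials are stored as coefficient lists: coeffs[j] multiplies t**j.
    """
    integrated = [Fraction(0)] + [c / (j + 1) for j, c in enumerate(coeffs)]
    integrated[0] += u0
    return integrated

def evaluate(coeffs, t):
    """Evaluate the polynomial with the given coefficients at t."""
    return sum(c * t**j for j, c in enumerate(coeffs))

u0 = Fraction(1)
iterate = [u0]  # u_0(t) = u0, the constant starting guess
for _ in range(5):
    iterate = picard_step(iterate, u0)

value_at_half = float(evaluate(iterate, Fraction(1, 2)))
```

After five steps the iterate is the degree-5 Taylor polynomial of $e^t$, so `value_at_half` already agrees with $e^{1/2}$ to about four decimal places.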
Theorem B.6 applies to the Einstein equations in the (g-) wave gauge, except that: 

• Instead of a single unknown $u$ we now have 10 unknowns $g_{\mu\nu}$, with one equation for each 
(but the ensuing system is coupled, since $g^{\rho\sigma}$ is a function of all $g_{\mu\nu}$ and so is $F(g, \partial g)$). 

• The Cauchy surface $\{t = 0\} \subset \mathbb{R}^{n+1}$ is replaced by a 3d (Riemannian) manifold $\Sigma$. 

• The initial data $u(0,\cdot) = f$ and $\dot{u}(0,\cdot) = g$ are replaced by the Cauchy data $(\tilde{g}, \tilde{k})$ on $\Sigma$. 

• Using either local coordinate patches and a partition of unity, or a background metric 
on $\Sigma$ making the construction coordinate-independent (like the g-wave gauge), one can 
define Sobolev spaces $H^s(\Sigma)$ for any $s \in \mathbb{R}$ (in view of the lower bound on $s$ in Theorem B.6, $s \in \mathbb{N}$ 
is enough).⁶⁸⁰ This construction may be extended from functions on $\Sigma$ to arbitrary tensors 
$t \in \mathfrak{X}^{(k,l)}(\Sigma)$, yielding Sobolev spaces $H^s_{(k,l)}(\Sigma)$. Thus one may say, e.g., $\tilde{k} \in H^s_{(2,0)}(\Sigma)$. 

• The PDE (B.65) is replaced by the reduced (vacuum) Einstein equations (7.121). 


This eventually leads to Theorem 7.16 in §7.6 and its localization Proposition 7.17. Much as 
uniqueness is proved from an energy inequality, the localized uniqueness of the above kind is 
proved from a localized energy inequality. We merely explain this for the free wave equation 
$\Box u = 0$ in $\mathbb{R}^{n+1}$, but the principle is the same also in Lorentzian geometry.⁶⁸¹ 

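For any $0 \le t \le R$, $(t,x) \in \mathbb{R}^{n+1}$, and (reasonable) function $u(t,x)$, define 

$E(t,x,R) = \frac{1}{2} \int_{B(x;R-t)} d^n y\, [\dot{u}(t,y)^2 + \nabla u(t,y) \cdot \nabla u(t,y)]$. (B.66) 

This is just the energy of $u$, restricted to the ball $B(x;R-t) \subset \mathbb{R}^n$. If $\Box u = 0$, then 

$0 \le s \le t \Rightarrow 0 \le E(t,x,R) \le E(s,x,R)$. (B.67) 

That is, $t \mapsto E(t,x,R)$ is monotonically non-increasing. Fix $R > 0$, and note that 

$E(0,x,R) = \frac{1}{2} \int_{B(x;R)} d^n y\, [g(y)^2 + \nabla f(y) \cdot \nabla f(y)]$. (B.68) 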
Eq. (B.68) implies that if $f(y) = g(y) = 0$ for all $y$ such that $|y - x| \le R$, then $E(0,x,R) = 0$, 
and hence $E(t,x,R) = 0$ for all $0 \le t \le R$ by (B.67). By (B.66), $\dot{u}$ and $\nabla u$ then vanish throughout the 
truncated cone $\{(t,y) : |y - x| \le R - t\}$, so that $u$ is constant there and equal to its initial value $f = 0$; in particular $u(t,x) = 0$. Taking 
$R = t$ shows that if $f(y) = g(y) = 0$ for all $y$ such that $|y - x| \le t$, then $u(t,x) = 0$. In other 
words, if $f = g = 0$ within $\Sigma_0 \subset \Sigma$ (defined as the $t = 0$ hyperplane $\mathbb{R}^n$ in $\mathbb{R}^{n+1}$), then $u = 0$ 
within $D^+(\Sigma_0)$. Equivalently, if $u_1 = u_2$ and $\dot{u}_1 = \dot{u}_2$ at $\Sigma_0$, then $u_1 = u_2$ in $D^+(\Sigma_0)$. In case of 
the Einstein equations, $u_1 = u_2$ becomes $g_1 \cong g_2$ (isometrically), but otherwise the reasoning is 
similar, ultimately based on the property $g_1 = g_2$ if both metrics are brought into the same gauge. 
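The monotonicity (B.67) of the localized energy can be observed numerically in $n = 1$, where the exact solution is available from (B.36). The sketch below assumes the hypothetical initial data $f(y) = e^{-y^2}$ and $g = 0$, so that $u(t,y) = \frac{1}{2}(f(y+t) + f(y-t))$, and approximates (B.66) by a midpoint rule; the values of `x`, `R`, and the sample times are arbitrary choices.

```python
import math

def f_prime(y):
    # Derivative of the hypothetical initial profile f(y) = exp(-y^2); g = 0 is assumed.
    return -2.0 * y * math.exp(-y * y)

def local_energy(t, x, R, steps=4000):
    """Midpoint-rule value of (B.66) in n = 1 for u(t, y) = (f(y + t) + f(y - t)) / 2."""
    radius = R - t
    h = 2.0 * radius / steps
    total = 0.0
    for i in range(steps):
        y = x - radius + (i + 0.5) * h
        u_t = 0.5 * (f_prime(y + t) - f_prime(y - t))  # time derivative of u
        u_y = 0.5 * (f_prime(y + t) + f_prime(y - t))  # space derivative of u
        total += (u_t * u_t + u_y * u_y) * h
    return 0.5 * total

x, R = 0.5, 2.0
times = [0.0, 0.5, 1.0, 1.5, 2.0]
energies = [local_energy(t, x, R) for t in times]
```

The list `energies` decreases with `t` and reaches zero at `t = R`, where the ball $B(x; R - t)$ has shrunk to a point.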


679 This works if $f, g \in C_c^\infty(\mathbb{R}^n)$. For initial data $f \in H^{s+1}(\mathbb{R}^n)$ and $g \in H^s(\mathbb{R}^n)$ one needs to approximate $f$ and $g$ 
within the spaces mentioned by sequences $(f_k)$ and $(g_k)$ in $C_c^\infty(\mathbb{R}^n)$, respectively, upon which the initial conditions 
for (B.65) change into $u_{k+1}(0,x) = f_{k+1}(x)$ and $\dot{u}_{k+1}(0,x) = g_{k+1}(x)$. 

680 See Taylor (1996), Vol. I, chapter 4, Ringström (2009), chapter 15, or Choquet-Bruhat (2009), Appendix I. 

681 See Choquet-Bruhat (2009), Appendix III, Theorem 2.15. 
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transitive, 71 
adiabatic accessibility, 311 
affine parametrization, 50 
algebra, 32 
associative, 32 
commutative, 32 
Lie, 32 
Amstel Hotel, vii 
anchor (of Lie algebroid), 214 
anti de Sitter space, 70 
as space of constant curvature, 70 
not globally hyperbolic, 122 
apparent horizon, 306 
arc length, 50 
asymptotic (ADM) energy, 186 
asymptotic (ADM) momentum, 187 
asymptotically flat 
at null infinity, 266 
initial data set, 185 
Riemannian manifold, 185 
space-time, 185 
atlas, 31 
equivalent, 31 


B 
n-bein, 39 
Bekenstein—Hawking entropy, 310 
Bianchi identities, 61 
contracted, 153 
electromagnetism, 157 
bifurcation surface, 290 
Birkhoff’s theorem, 294 
black hole, 232, 270 
apparent horizon, 307 
area, 305 
Cauchy horizon, 284 
event horizon, 270, 284 
extremal, 242, 248 
Killing horizon, 289 
region, 270 
shadow, 225 
uniqueness theorems, 293, 301 
black hole thermodynamics, 309 
first law, 309, 314-316 
second law (= area law), 309, 312 


zeroth law, 309, 313 
boundary 


of manifold with boundary or corners, 44 


Boyer-Lindquist coordinates, 247 


C 
C*-structure, 31 
Cartan involution, 329 
Cartan’s formula, 149 
Cartan-Ambrose-Hicks theorem, 327 
Cauchy development, 165 
future, 114 
maximal (= MGHD), 166 
past, 114 
two-sided, 114 
Cauchy horizon, 116, 284 
future, 116, 284 
Kerr, 251 
of wannabe Cauchy surface, 284 
past, 116, 284 
Reissner-Nordström, 243 
Cauchy surface 
wannabe (= partial), 116, 284 


Cauchy surface (= Cauchy hypersurface), 113 


causal diamond, 110 
causal ladder, 110 
causal relations 
E+, 94 
I+, 94 
J*, 94 
Cayley transform, 258 
change of coordinates formula, 36 
characteristic initial value problem, 170 
characteristics, 170 
chart, 31 
Choquet-Bruhat, Yvonne (1923), vi, 24 
Choquet-Bruhat-Geroch theorem, 167 
Christoffel symbols, 50 
Circle Limit IV (Heaven and Hell), 69 
circularity theorem, 304 
Codazzi’s equation, 79 
collar neighbourhood theorem, 44, 298 
commutator, 32 
compact-open topology, 71 
concatenation of tensors, 43 
conformal 
compactification, 258 
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embedding, 258 
flatness, 75 
Killing operator, 198 
Laplacian, 197 
transformation, 75 
congruence (of curves) 
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expansion, 129 D’Alembertian, 159 
null, 137 Darmois identity, 179 
shear, 129 Darmois, Georges (1888-1960), 23 
timelike, 128 de Sitter space, 70, 125, 221 
vorticity, 129 as space of constant curvature, 70 
conjugate point, 102 Killing horizon, 222 
connection static patch, 221 
flat, 53 deformation algebra, 205 
Levi-Civita, 54 derivation, 32, 45 
linear, 52 point, 32 
metric, 55 derivative 
on a vector bundle, 55 classical, 334 
torsion-free, 53 weak, 334 
connection coefficients, 52, 55 development (of initial data), 276 
constant curvature, 68, 328 diffeomorphism, 31 
constraint diffeomorphism group, 31 
electromagnetism, 157 Dirichlet integral, 163 
Hamiltonian, 172, 181 distance, 49 
momentum, 172, 181 distribution, 333 
constraints tempered, 334 
general relativity, 159 divergence, 148 
contraction (of indices), 43 domain of dependence, 114, 284 
coordinate system, 31 domain of flow, 36 
coordinates, 31, 45 domain of influence, 284 
corner point, 44 domain of outer communication, 275 
coset space, 71 double cone, 110 
cosmic censorship dual 
PDE version, 276 basis, 43 
Penrose, 272, 273 vector space, 37 
cotangent bundle, 39, 45 dust, 155 
Cotton tensor, 75 
covariant approach E 
electromagnetism, 158 Eddington-Finkelstein coordinates, 230 
general relativity, 161 edge, 116, 284 
covariant derivative, 52 edgeless subset, 116 
covectors, 39 eikonal equation, 136 
cross-section, 33 Einstein field equations, 1, 15, 147 
curvature, 68 characteristic initial value problem, 170 
curvature tensor, 59 dynamical, 159, 181 
curve, 35 existence and uniqueness of solutions, 167 
affine parametrization, 50 non-characteristic initial value problem, 163 
arc length parametrization, 50 properties of solutions, 170 
causal, 94 Einstein flow, 203 
continuous causal, 107 Einstein manifold, 74 
endless, 106 Einstein metric, 74 
energy, 50 Einstein summation convention, 40, 45 
future extendible, 106 Einstein tensor, 74, 153 
future inextendible, 106 reduced, 160 
inextendible, 106 Einstein’s static universe, 264 
length, 49 Einstein, Albert (1879-1955), v, 1-6, 8-19, 22, 26-28, 
lightlike, 94 70, 126, 127, 155 
past extendible, 106 Einstein—Hilbert action, 147 
past inextendible, 106 Einstein—Rosen bridge, 237 
spacelike, 94 electric field, 157 
timelike, 94 electromagnetic field, 156 


electromagnetism, 157 
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end (of asymptotically flat space-time), 185 
energy conditions, 154 
dominant (DEC), 154 
null (NEC), 154 
strengthened dominant (SDEC), 154 
weak (WEC), 154 
energy density, 154 
energy inequality, 338 
energy-momentum four-vector, 154 
energy-momentum tensor, 154 
conservation law, 155 
dust, 155 
electromagnetic field, 156 
gravitational field, 189 
perfect fluid, 155 
scalar field, 156 
Entwurf Theorie, 12 
equation of geodesic deviation, 85 
equivalence principle, 4 
ergosphere, 252 
ergosurface 
inner, 252 
outer, 252 
Ernst equation, 304 
Ernst potential, 304 
Escher, Maurits Cornelis (1898-1972), vii, 69 
Euclidean group, 73, 318 
Euclidean space, 68 
Euler equations, 155 
Euler-Lagrange equations, 49 
event horizon, 284 
future, 270 
Kerr, 251 
past, 270 
Reissner-Nordström, 243 
Schwarzschild, 230 
exponential map, 67, 88 
exterior derivative, 39, 45, 147 
exterior multiplication, 147 
extrinsic curvature, 64, 79, 132 
mean, 132 


F 
p-form, 147 
1-form, 39, 45 
Fermi derivative, 133 
Fermi normal coordinates, 92 
fiber, 33 
final state conjecture, 293 
fine-tuning problem, 280 
first fundamental form, 64 
FLRW solution, 125 
focal point, 134, 139 
foliation, 175
canonical, 206
Fourier transform, 335 
frame, 39 


Frobenius theorem, 60 
function 
smooth, 31 
fundamental theorem for hypersurfaces, 80 
future set, 284 


G
gauge condition
Lorenz gauge, 158 
wave gauge, 159 
gauge invariance, 157 
Gauss equation, 79 
Gauss curvature, 65 
Gauss law, 157 
Gauss Lemma, 90 
Gauss, Carl Friedrich (1777-1855), 6, 59, 64, 66 
Gauss-Codazzi equations, 67, 79 
Gauss-Weingarten equations, 67, 79 
Geiser, Carl Friedrich (1843-1934), 6 
Gelfand triple, 334 
general covariance, 8-13, 15, 26-29, 190, 199 
generator of horizon, 286 
genericity condition (Hawking-Penrose), 146 
geodesic, 49, 53 
complete, 106 
deviation, 85 
equation, 50 
future complete, 106 
incomplete, 106 
normal coordinates, 89 
past complete, 106 
reflection, 327 
geodesically complete manifold, 51 
geometric uniqueness, 166 
global hyperbolicity, 110-113, 118, 121, 131, 143, 
146, 275 
and determinism, 280 
and strong cosmic censorship, 274, 276 
failure in anti de Sitter space-time, 122 
failure in Kerr space-time, 283 
failure in Reissner-Nordström space-time, 283 
globally hyperbolic development, 165 
maximal (= MGHD), 166 
graviton, 191 
Grossmann, Marcel (1878-1936), 6 
groupoid, 214 
action, 214 
Lie, 214 
pair, 214 


H 
h-arc length, 107 
Hamilton’s equations, 209 
Hamiltonian
Lie algebra action, 209
harmonic coordinates, 159 
harmonic map, 163 
Hawking temperature, 310
Hawking, Stephen (1942-2018), v, vii, 25, 111, 116, 
126, 257, 270, 310 
area law, 309 
rigidity theorem, 301 
singularity (incompleteness) theorem, 131 
Hawking-Penrose singularity theorem, 145 
Hilbert, David (1862-1943), 1, 8, 10, 13-15, 17-20, 
22, 23, 125, 147, 219, 224 
Hole Argument, 13 
homogeneous 
G-space, 321 
(semi) Riemannian manifold, 326 
space, 71, 72, 321 
horismos, 94 
horizon 
apparent, 307 
Cauchy, 116, 284 
event, 270, 284 
Killing, 289 
Huygens principle, 338 
strong, 338 
hyperboloid, 68 
hypersurface, 76 
null, 77 
spacelike, 77 
timelike, 77 


I
ideal point, 274 
impact parameter, 226 
indices, 40 
inertial frame dragging, 249 
infinite redshift, 230 
insertion map, 147 
integration of curve, 36 
interior
of manifold with boundary or corners, 44
isometry, 57, 71 
isometry group, 71, 326 
isotropic space, 72 
isotropy representation, 322 
Israel’s theorem, 296 


J 

Jacobi equation, 85 
Jacobi field, 85 

Jacobi identity, 32, 319 


K 
Kerr metric 
extremal, 248 
rapidly rotating, 248 
slowly rotating, 248 
Kerr rigidity, 303 
Kerr stability, 303 
Kerr-Schild form, 251 


Killing horizon, 284, 289 
bifurcate, 223, 290 
de Sitter space-time, 223 
degenerate, 291 
Kerr space-time, 252 
non-degenerate, 291 
Reissner-Nordström space-time, 245 
Schwarzschild space-time, 222 
Killing vector field, 57 
Komar formulae, 248 
Koszul formula, 54 
Kretschmann scalar 
Kerr, 247 
Reissner-Nordström, 241 
Schwarzschild, 225 
Kruskal coordinates, 232 
Kruskal diagram, 233 
Kruskal-Szekeres coordinates, 234 
Kulkarni-Nomizu product, 75 


L 
Lambert W-function, 229 
Laplacian determinism, 279 
lapse, 121, 128, 175 
Leibniz rule, 32, 43, 52, 55 
Levi-Civita, Tullio (1873-1941), 8, 19
Lichnerowicz, André (1915-1998), 24
equation, 198 
theorem, 186 
Lie algebra, 32, 319 
action, 209 
structure constants, 209 
Lie algebroid, 214 
Lie derivative, 36, 43, 45 
Lie group, 317 
Lie product formula, 319 
Lie’s third theorem, 330 
Lie-Poisson bracket, 209 
lightcone, 93 
backward, 93 
forward, 93 
limit curve lemma, 109, 118 
Liouville’s theorem, 296 
Lorentz group, 190, 317 
proper orthochronous, 190 
Lorentzian cover, 70 
Lorentzian distance, 99 
Lorenz gauge, 158 
Lovelock’s Theorem, 150 
lowering and raising of indices, 48 


M 
Mach’s principle, 3 
manifest image, 175 
manifold 
C31 
geodesically complete, 51 
locally flat, 60
Lorentzian, 47 
orientable, 147 
Riemannian, 47 
semi-Riemannian, 47 
smooth, 31 
time orientable, 93 
topological, 31 
with boundary, 44 
with corners, 44 
map 
equivariant, 72 
smooth, 31 
maximal slicings, 197 
mean curvature, 65 
metric 
de Sitter, 125 
densitized, 162 
FLRW, 125 
Kerr, 247 
Kerr-Newman, 255 
Majumdar-Papapetrou, 299 
Minkowski, 47 
on vector bundle, 55 
on vector space, 37 
Papapetrou form, 304 
Reissner-Nordström, 240 
Schwarzschild, 125, 224 
metric tensor, 47 
Lorentzian, 47 
Riemannian, 47 
semi-Riemannian, 47 
MGHD = maximal globally hyperbolic development, 
166 
Minguzzi’s singularity theorem, 146 
minimal area enclosure, 308 
minimal coupling, 182 
minimal surface, 307 
outermost, 308 
Minkowski hypercylinder, 98 
Minkowski space-time, 47, 70 
Minkowski, Hermann (1864-1909), 17 
Misner, Charles (1932), vi, 126 
module, 32 
finitely generated projective, 33 
free, 33 
momentum density, 154 
momentum map, 209 
MOTS = marginally outer trapped surface, 307 


N 
nbhd = neighbourhood, 31 
neighbourhood
convex, 88
normal, 67, 88
star-shaped, 88
Noether’s theorem, 153 


Noether’s theorem, 209
non-covariant approach 
electromagnetism, 158 
general relativity, 161, 175 
null curvature condition, 143 
null expansion, 139 
null infinity 
future, 259, 262, 270 
past, 259, 262, 270 


O


optical function, 136 
orientation, 147 
orthonormal basis, 37 


P 
parallel transport, 52 
past set, 284 
PDE 
hyperbolic, 341 
quasi-linear, 341 
Penrose diagram, 262 
anti-de Sitter space, 265 
de Sitter space, 265 
Kerr space-time, 254, 255 
Kruskal space-time, 236, 269 
Minkowski space-time, 262 
Oppenheimer-Snyder space-time, 239 
Reissner-Nordström space-time, 240, 241, 245 
Schwarzschild space-time, 236 
Penrose inequality, 308 
Riemannian, 308 
Penrose process, 252, 312 
Penrose, Roger (1931), v-vii, 1, 22, 25, 127, 136, 145, 
170, 270 
final state conjecture, 293 
singularity (incompleteness) theorem, 127, 143 
strong cosmic censorship, 272, 273 
weak cosmic censorship, 272 
perfect fluid, 155 
photon, 191 
photon capture radius, 225 
photon sphere, 225, 227 
Plateau Problem, 197 
Poincaré disc, 258
Poincaré group, 73, 318 
Poincaré upper half-plane, 258 
point derivation, 45 
Poisson algebra, 208 
Poisson bracket, 208 
Poisson manifold, 208 
positive mass theorem, 187 
pregeodesic, 51 
problem of time, 217 
A-series, 217 
B-series, 217 
manifest time, 217 
propagation of constraints
electromagnetism, 158 
general relativity, 161 
propagation of gauge 
electromagnetism, 158 
general relativity, 161 
pullback 
of covector, 39 
of function, 35 
pushforward 
of point derivation, 35 
of tangent vector, 36 


R 
Raychaudhuri equation, 130 
null, 142 
reductive decomposition, 322 
Rellich theorem, 336 
rest photons, 231 
Ricci Flow, 188 
Ricci scalar, 48 
Ricci tensor, 74 
wave-gauged, 160 
Ricci-Curbastro, Gregorio (1853-1925), 8 
Riemann tensor, 60 
Riemann, Bernhard (1826-1866), 6, 7, 69 
Riemannian geometry, 6 
Riemannian manifold, 47 
asymptotically flat, 185 
Rindler horizon, 289 
Rindler wedge, 289 


S 
scalar curvature, 74 
scalar field, 156 
Schwarzschild radius, 224 
Schwarzschild solution, 125 
scientific image, 175 
second fundamental form, 64 
section of null hypersurface, 137 
sectional curvature, 63 
Seeley’s extension theorem, 44 
semi-colon notation, 56 
semidirect product, 318 
Serre-Swan Theorem, 33 
shift, 175 
signature of metric, 37 
singularity 
definition, 126, 127 
locally naked, 274 
naked, 242, 275 
ring, 247 
spacelike, 242 
timelike, 242 
singularity (incompleteness) theorem 
Chrusciel-Galloway, 146
Eichmair-Galloway-Pollack, 146
Fewster-Galloway, Fewster-Kontou, 146
Freivogel-Kontou-Krommydas, 146
Gannon-Lee (topological), 146
Graf-Grant-Kunzinger-Steinbauer, 146
Hawking, 131
Hawking-Penrose, 145
Lesourd, 146
Minguzzi, 146
Penrose, 127, 143

Smarr’s formula, 312 

Sobolev duality theorem, 336 

Sobolev embedding theorem, 336 

Sobolev space, 335 

space, 31 
constant curvature, 68, 328 
locally symmetric, 327 
symmetric, 327 

space-time, 93 
anti de Sitter, 70 
asymptotically flat, 185 
asymptotically flat and stationary, 186 
asymptotically flat at null infinity, 266 
asymptotically simple, 259 
causal, 110 
causally incomplete, 127 
chronological, 110 
de Sitter, 70 
electrovac, 300 
extendible, 127 
future asymptotically predictable, 275 
globally hyperbolic, 110 
inextendible, 127 
Kerr, 247 
Kerr-Newman, 255 
Kruskal, 233 
Majumdar-Papapetrou, 299 
Minkowski, 47 
non-imprisoning, 110 
non-partially imprisoning, 110 
non-totally vicious, 121 
Oppenheimer-Snyder, 238 
past distinguishing, 273 
Quinten, 96 
reflecting, 121 
Reissner-Nordström, 242 
Schwarzschild, 230, 233 
singular, 127 
spherically symmetric, 294 
stably causal, 119 
static, 185 
stationary, 185 
strongly causal, 110 
totally vicious, 121 

spacelike infinity, 262, 270 

spatial projection, 129 

sphere, 68 
stabilizer, 322

static observer, 229 

staticity theorem, 303 

stationary limit surface, 252 

Stokes theorem, 149 

stress tensor, 154 

strong cosmic censorship 
PDE version, 277 
Penrose version, 274 

Strong Energy Condition (SEC), 131 

submanifold, 76 
k-dimensional, 76 
embedded, 76 
immersed, 76 

surface gravity, 290, 291 
Reissner-Nordström, 242 
Schwarzschild, 232 

symmetric space, 327 

symplectic quotient, 215 

symplectic reduction, 215 

Synge’s formula, 87, 228 


T 
tangent bundle, 33, 34, 45
of manifold with boundary or corners, 44
tangent vector, 45 
temporal function, 120 
tensor, 40
of type (k, l), 45
tensor field, 40 
tensor product, 37 
tensoring, 42 
test function, 333
rapidly decreasing, 333
tetrad, 39 
Theorema Egregium, 66 
time function, 119 
time orientation, 93 
timelike curvature condition, 131 
timelike infinity
future, 262, 270
past, 262, 270
top element of poset, 167 
topological censorship, 302 
topological singularity theorem, 146 
torsion, 53 
total domain of dependence, 114 
total imprisonment, 110 
transverse traceless, 198 
trapped surface
future, 140
future outer, 307
marginally outer, 307
outer, 146
weakly outer, 307
trivial bundle, 33 


U 

uniform convergence, 108 
uniformization theorem, 196 
uniqueness theorems, 293 


V 


vacuum Einstein equations, 152 
vector 
“length”, 94 
causal, 93 
future-directed (fd), 93 
lightlike, 93 
null, 93 
past-directed (pd), 93 
spacelike, 93 
timelike, 93 
vector bundle, 33 
vector bundle map, 33 
vector field, 34, 45 
acceleration, 177 
complete, 36 
flow, 36 
Gaussian, 206 
Hamiltonian, 208 
vierbein, 39 


W 


wave coordinates, 159 
wave equations 
linear, 337 
quasi-linear, 341 
wave map, 163 
weak cosmic censorship 
PDE version, 277 
Penrose version, 275 
weak null singularities, 279 
Weingarten map, 64, 79 
Weyl tensor, 75 
Weyl, Hermann (1885-1955), v, 2, 20, 21, 23, 25, 224 
white hole, 232, 270 
white hole region, 270 
Wigner cocycle, 192 
wormhole, 237 


Y 
Yamabe class, 197 
Yamabe problem, 196 


This book, dedicated to Roger Penrose, is a second, mathematically 
oriented course in general relativity. It contains extensive references and 
occasional excursions in the history and philosophy of gravity, including 
a relatively lengthy historical introduction. The book is intended for all 
students of general relativity of any age and orientation who have a background
including at least first courses in special and general relativity,
differential geometry, and topology. The material is developed in such a 
way that through the last two chapters the reader may acquire a taste 
of the modern mathematical study of black holes initiated by Penrose, 
Hawking, and others, as further influenced by the initial-value or PDE 
approach to general relativity. Successful readers might be able to begin 
reading research papers on black holes, especially in mathematical 
physics and in the philosophy of physics. The chapters are: Historical 
introduction, General differential geometry, Metric differential geometry, 
Curvature, Geodesics and causal structure, The singularity theorems of 
Hawking and Penrose, The Einstein equations, The 3+1 split of space-time,
Black holes I: Exact solutions, and Black holes II: General theory.
These are followed by two appendices containing background on Lie 
groups, Lie algebras, & constant curvature, and on Formal PDE theory. 